EFFICIENCY CONSIDERATIONS BETWEEN COMMON WEB APPLICATIONS USING THE SOAP PROTOCOL



Similar documents
Efficiency Considerations of PERL and Python in Distributed Processing

Efficiency of Web Based SAX XML Distributed Processing

Virtual Credit Card Processing System

Integrated Open-Source Geophysical Processing and Visualization

GenericServ, a Generic Server for Web Application Development

PIE. Internal Structure

Chapter 1. Dr. Chris Irwin Davis Phone: (972) Office: ECSS CS-4337 Organization of Programming Languages

Language Evaluation Criteria. Evaluation Criteria: Readability. Evaluation Criteria: Writability. ICOM 4036 Programming Languages

Characteristics of Java (Optional) Y. Daniel Liang Supplement for Introduction to Java Programming

LinuxWorld Conference & Expo Server Farms and XML Web Services

CS 3530 Operating Systems. L02 OS Intro Part 1 Dr. Ken Hoganson

1. Overview of the Java Language

CatDV Pro Workgroup Serve r

Software: Systems and. Application Software. Software and Hardware. Types of Software. Software can represent 75% or more of the total cost of an IS.

PERFORMANCE COMPARISON OF COMMON OBJECT REQUEST BROKER ARCHITECTURE(CORBA) VS JAVA MESSAGING SERVICE(JMS) BY TEAM SCALABLE

Document management and exchange system supporting education process

A standards-based approach to application integration

Web Services for Management Perl Library VMware ESX Server 3.5, VMware ESX Server 3i version 3.5, and VMware VirtualCenter 2.5

ISM/ISC Middleware Module

System Structures. Services Interface Structure

Functions of NOS Overview of NOS Characteristics Differences Between PC and a NOS Multiuser, Multitasking, and Multiprocessor Systems NOS Server

Load Balancing on a Non-dedicated Heterogeneous Network of Workstations

Network operating systems typically are used to run computers that act as servers. They provide the capabilities required for network operation.

School of Computer Science

PARALLEL & CLUSTER COMPUTING CS 6260 PROFESSOR: ELISE DE DONCKER BY: LINA HUSSEIN

Introduction to Web Services

Client/Server Computing Distributed Processing, Client/Server, and Clusters

Building Applications Using Micro Focus COBOL

Content Distribution Management

Scalability and Classifications

WEB SERVICES. Revised 9/29/2015

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET)

Middleware Lou Somers

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Building an efficient and inexpensive PACS system. OsiriX - dcm4chee - JPEG2000

Chapter 4. Architecture. Table of Contents. J2EE Technology Application Servers. Application Models

Automatic Configuration and Service Discovery for Networked Smart Devices

Information and Communications Technology Courses at a Glance

The EMSX Platform. A Modular, Scalable, Efficient, Adaptable Platform to Manage Multi-technology Networks. A White Paper.

Kernel Types System Calls. Operating Systems. Autumn 2013 CS4023

Research and Design of Universal and Open Software Development Platform for Digital Home

SIP Protocol as a Communication Bus to Control Embedded Devices

SOAP - A SECURE AND RELIABLE CLIENT-SERVER COMMUNICATION FRAMEWORK. Marin Lungu, Dan Ovidiu Andrei, Lucian - Florentin Barbulescu

Enterprise Application Integration

Lesson 18 Web Services and. Service Oriented Architectures

Universidad Simón Bolívar

Oracle Desktop Virtualization

CS550. Distributed Operating Systems (Advanced Operating Systems) Instructor: Xian-He Sun

System Models for Distributed and Cloud Computing

Performance Testing Tools: A Comparative Analysis

Module 17. Client-Server Software Development. Version 2 CSE IIT, Kharagpur

Enhancing A Software Testing Tool to Validate the Web Services

DIABLO VALLEY COLLEGE CATALOG

How to Send Video Images Through Internet

SOFT 437. Software Performance Analysis. Ch 5:Web Applications and Other Distributed Systems

Chapter 1 - Web Server Management and Cluster Topology

Classic Grid Architecture

In: Proceedings of RECPAD th Portuguese Conference on Pattern Recognition June 27th- 28th, 2002 Aveiro, Portugal

Introduction to Service Oriented Architectures (SOA)

Six Strategies for Building High Performance SOA Applications

PERFORMANCE ANALYSIS OF KERNEL-BASED VIRTUAL MACHINE

An On-line Backup Function for a Clustered NAS System (X-NAS)

The Service Revolution software engineering without programming languages

A Survey on Availability and Scalability Requirements in Middleware Service Platform

COMP[29]041 - Software Construction 15s2. Course Goals. Course Goals. 41 Introduction 15s2]COMP[29]041 Introduction 15s2

Masters in Human Computer Interaction

Masters in Advanced Computer Science

Masters in Artificial Intelligence

Motivation Definitions EAI Architectures Elements Integration Technologies. Part I. EAI: Foundations, Concepts, and Architectures

SQLstream Blaze and Apache Storm A BENCHMARK COMPARISON

EAI OVERVIEW OF ENTERPRISE APPLICATION INTEGRATION CONCEPTS AND ARCHITECTURES. Enterprise Application Integration. Peter R. Egli INDIGOO.

Software: Systems and Application Software

What is a Web service?

Java in Education. Choosing appropriate tool for creating multimedia is the first step in multimedia design

AQA GCSE in Computer Science Computer Science Microsoft IT Academy Mapping

Virtual Machines.

Avaya Aura Orchestration Designer

IMCM: A Flexible Fine-Grained Adaptive Framework for Parallel Mobile Hybrid Cloud Applications

Introduction into Web Services (WS)

W3Perl A free logfile analyzer

SOFTWARE ENGINEERING PROGRAM

i.sight ecommerce system

Example of Standard API

How To Install Linux Titan

Shared Parallel File System

Middleware- Driven Mobile Applications

Driving force. What future software needs. Potential research topics

1/20/2016 INTRODUCTION

Lecture 26 Enterprise Internet Computing 1. Enterprise computing 2. Enterprise Internet computing 3. Natures of enterprise computing 4.

1. INTERFACE ENHANCEMENTS 2. REPORTING ENHANCEMENTS

Petascale Software Challenges. Piyush Chaudhary High Performance Computing

Interconnect Efficiency of Tyan PSC T-630 with Microsoft Compute Cluster Server 2003

Protect Your Cisco UCS Domain with Symantec NetBackup

Qint Software - Technical White Paper

How To Test Your Web Site On Wapt On A Pc Or Mac Or Mac (Or Mac) On A Mac Or Ipad Or Ipa (Or Ipa) On Pc Or Ipam (Or Pc Or Pc) On An Ip

Grid Computing Vs. Cloud Computing

4D and SQL Server: Powerful Flexibility

Detailed Design Report

School of Computer Science

Extending Desktop Applications to the Web

Transcription:

EFFICIENCY CONSIDERATIONS BETWEEN COMMON WEB APPLICATIONS USING THE SOAP PROTOCOL Roger Eggen, Sanjay Ahuja, Paul Elliott Computer and Information Sciences University of North Florida Jacksonville, FL 32224 USA ree, sahuja, ellp0001@unf.edu Maurice Eggen Department of Computer Science Trinity University San Antonio, TX 78212 USA meggen@trinity.edu ABSTRACT Applications can communicate over the WEB using a variety to tools and techniques. Both Java and PERL are common languages used in WEB communication. A Simple Object Access Protocol (SOAP) is used within the PERL and Java languages to determine which is the most efficient communication technique. KEYWORDS PERL, Java, and Distributed Applications 1. Introduction WEB communication can occur in a variety of ways using different techniques and tools. Distributed processing belongs to one of the most important areas of research in our modern computing world. In this paper, we discuss the use of PERL (Practical Extraction and Report Language) compared to Java both using SOAP (Simple Object Access Protocol). PERL is an easy to learn quick prototyping language and is used in many WEB applications. We then introduce Java as a tool that supports the same communication protocol between workers, that of SOAP. We compare the speed and efficiency of PERL and Java as they implement the SOAP protocol. 2. Fundamentals A distributed system is a collection of independent processors interconnected by a network and capable of cooperating to achieve the solution of a task. The task could be widely varied including motion, object detection, recognition, or a wide variety of other tasks that cooperate to arrive at a desired conclusion. By distributing the tasks to a wide variety of processors, we reduce the amount of processing at a central site. The system is distributed, data is decentralized, and computation and communication among workers asynchronous. Each worker will process the data related to its current configuration and report the results to a the central processor. The processing workers are considered loosely coupled if they do not share memory and have individual processing power, hence the idea of parallel distributed processing across the WEB. Important strides have been made recently to provide environments that make distributed computing more common and easier to implement. The message passing interface (MPI), parallel virtual machine (PVM), Common Object Request Broker Architecture (CORBA), Java s Remote Method Invocation (RMI), and others have extended ordinary languages to provide methods (functions, procedures) facilitating communication between distributed computing workers. PERL and Java are often used in distributed WEB applications. These languages are easy to program and have features that support distributed communication. This paper studies the SOAP protocol for Java and PERL. We compare features and efficiency of both in distributed communication. 3. PERL PERL is an open source, cross-platform programming language capable of performing a wide variety of applications. The word PERL is an acronym for Practical Extraction and Report Language. Used in both the public and private sectors, PERL is supported by UNIX, Macintosh, Windows and many more operating systems. PERL comprises the best features of many languages, including C, awk, sed, sh and BASIC. PERL works with

HTML, XML, and other markup languages to support unicode and both procedural and object oriented paradigms. PERL is extensible. Because of its excellent text processing capabilities, PERL is perhaps the most widely used Web or distributed programming language. Applications written using the PERL language include, but are not limited to the following: Application servers Artificial intelligence algorithms Astronomy Audio Bioinformatics Compression and encryption Content management systems (for both small and large scale Web sites) Database interfaces Date/Time Processing ecommerce Email processing GUI development Generic algorithms Graphing and charting Image processing Mathematical and statistical programming Natural language processing (in English, Chinese, Japanese, and Finnish, among others) Network programming Operating-system integration with Windows, Solaris, Linux, Mac OS, etc. PERL/Apache integration Spam identification Software testing Templating systems Prototyping for fast development Text processing Web services, Web clients, and Web servers XML/HTML processing [1] The widely available free modules from CPAN (www.cpan.org) support PERL applications development. These modules make PERL a good choice for development of distributed systems. 4. Java Java is easy to learn and an effective approach for meeting the requirements of distributed processing. The object oriented capabilities of Java make it a robust, efficient, and therefore desirable environment in which to program the parallel communication The Java interpreter and the extensive standard library are freely available in source or binary form for all major platforms from java.sun.com. The same site also contains programs, tools, and additional documentation. Java is object oriented with a class structure that encourages reuse. Java supports single inheritance, polymorphism, and all the other features expected of an object oriented language. The Java Virtual Machine is easily extended with new functions and data types implemented in either Java or by native routines that can be implemented in C or C++ (or other languages that can be invoked from C). 5. SOAP The XML syntax of a SOAP message is uncomplicated. A SOAP message consists of an envelope containing: 1) an optional header containing zero or more header entries (sometimes ambiguously referred to as headers), 2) a body containing zero or more body entries, and 3) zero or more additional, non-standard elements. The only body entry defined by SOAP is a SOAP fault which is used for reporting errors. Some of the XML elements of a SOAP message define namespaces, each in terms of a URL and a local name, and encoding styles; a standard one is defined by SOAP. Header entries may be tagged with the following optional SOAP attributes: 1) an actor which specifies the intended recipient of the header entry in terms of a URL, and 2) an indication that specifies whether or not the intended recipient of the header entry is required to process the header entry. 6. Hardware The hardware for evaluating SOAP efficiency consists of a Beowulf cluster of computers all running RedHat linux v9. The machines are 0.5 ghz machines with 512 megabytes of main memory connected by gigabit fast ethernet. 7. Software The software consists of Red Hat Linux 9 core 3.2.2-4, PERL 5.8.3 installed with threading, and Java HotSpot 1.4.1_01. SOAP::Lite 1.2 for PERL was used in the PERL communication and JAXRPC, jwsdp-1_3_01, was used in the Java communication. Only Linux machines were involved to remove discrepancies caused by different operating systems. 8. Programming a Distributed System Regardless of the language, programming a distributed system, involves one of several fundamentals. Typically, one of three scenarios are involved: boss/worker where the boss distributes a portion of the work uniquely,

worker crew where each processor does essentially the same processing on different portions of the data, often referred to as single instruction multiple data (SIMD), and a pipeline where each worker processes a portion of the data, passing that partial solution to the next worker. In this research we use the boss/worker environment as shown in Figure 1. A boss process defines the task to be accomplished and decides the method of distribution. Several worker processes do the actual work. The worker starts by receiving input from the central server (boss). The boss is the controlling process that has the responsibility of the overall system. A portion of the task is given to each of the workers. The boss waits to receive each of the completed tasks from the individual workers. Each of the workers execute in parallel. worker worker... worker 9. Evaluation Boss Figure 1 Partial Communication Dependency The purpose of this research is to determine how much a user gains (or gives up) to enjoy the ease of programming language PERL versus Java in a distributed environment using SOAP. We chose a simple sorting algorithm as the activity performed by each worker. To assess relative efficiency, we held the activity at each of the workers relatively constant. We programmed an n 2 sort so reasonable time would be taken on each of the worker processors, creating a measure of both communication and computation times. Boss and workers are programmed using Java and PERL using the SOAP application programming interface. The PERL side of the test used middleware called SOAP::Lite, while the Java side of the test used JAXRPC. Each application is implemented with exactly the same functionality. The boss machine creates sets of integers whose cardinality vary between 1,000 and 256,000. Each set was evenly divided and distributed among the worker machines. Each worker sorted their portion of the data, yielding a highly parallel solution. The communication from boss to worker to boss was handled through the SOAP protocol. As is often the case, the boss did not participate in the primary task, but rather distributed the work and waited for results. Timings included establishing communication, dividing the data, distributing the data, sorting each data set, receiving the sorted results from each worker, and merging to arrive at a totally sorted set of data. 10. Results We ran datasets of sizes 1000, 2000, 4000, 8000, 16000, 32000, 64000, 128000, and 256000. Each were sorted and timed. The timing results are in Figure 2. The Figure shows a clear cross over between communication and processing dominating the timings. PERL processes much slower than Java, thus when significant amounts of data are present at each worker, PERL requires more time. But, PERL establishes communication faster than Java. Figure 3 more clearly shows the relative efficiency for 4 workers. Some comments related to PERL are in order. In our preliminary tests, we found that the efficiency of PERL is significantly less than that of Java. A variety of considerations were employed to increase PERL s efficiency, each with limited success. PERL 5.8.0 ran 90% as fast as PERL 5.8.3, so one should certainly ensure the latest version of PERL is used. Also, threading is relatively new to PERL, only coming into existence in the 5.8.x versions. Threaded PERL 5.8.3 runs 96% as fast as a non-threaded installation. In a distributed application, installation of a threaded version of PERL is essential. PERL 6 is soon to be released and hopefully, efficiency will be addressed. With these considerations in mind, we implemented the PERL application by forking new processes rather than using threads. PERL s forking mechanism is mature and well developed, therefore, until PERL 6 is available which we believe will provide more stable threading, forking is the method of choice. The PERL application can be compiled to native code using perlcc. Perlcc is in a nonstable state and should not be used for production applications, but it did generate native code. However, the native code executed only 92% as fast as the code directly interpreted. We expect a significant increase in performance when perlcc is enhanced. A less defined measure is programmer productivity. We carefully monitored our development time for each of the applications. While this is somewhat arbitrary, ill defined, and rather qualitative, we believe our experience is suggestive of the nature one can expect when considering these environments. Upon evaluation of our respective experiences in each environment, we believe programmer time was most efficient in PERL and with development of the Java application approximately 30% longer. We realize this consideration is highly arbitrary, but believe programmer productivity is a significant factor in cost of application development.

TIME seconds Workers 2 4 8 Data Size PERL Java PERL Java PERL Java 1000 8 13 4 13 5 12 2000 14 13 6 13 6 13 4000 50 13 13 13 9 13 8000 115 15 34 14 16 14 16000 347 21 108 17 42 16 32000 2980 49 894 23 451 19 64000 9257 167 5682 51 3977 27 128000 17787 696 11612 202 8990 56 256000 74705 2830 46448 1176 21994 181 Figure 2 Efficiency Timings 50000 45000 However, when developer productivity, ease of programming, portability, and prototyping are desired, PERL and Java are viable alternatives. PERL and Java support a robust and convenient API for implementing distributed, Web based, or distributed intelligent agent applications. 40000 35000 30000 25000 20000 15000 10000 5000 0 1k 4k 16k 64k 256k Figure 3 Rate of Growth Perl Java Java s communication is efficient, which accounts for some of its performance. Also, interpreting byte code is more efficient than pure interpretation. Note, each application scales appropriately with the size of the data set. We note that initially PERL is more efficient than Java in establishing communication. This accounts for the lower timings PERL received when small amounts of data is present. However, Java quickly out performs PERL when more computation is required. PERL is quite inefficient when any degree of processing is required. This leads to a decision for software developers. If establishing communication is the primary purpose of the application, PERL is a very good choice. If any significant amount of processing is required after communication is established, Java is the language of choice. 11. Summary and Conclusions The above timings were taken during a period in which the network and processors were not used by others. If the fastest application processing possible is the goal, PERL or Java would not be the choice. PVM or MPI with native routines written in C will outperform all the above applications. 12. Bibliography/References [1] http://www.perl.org/ [2] http://java.sun.org [3] Wall, L., Christiansen, T., and Orwant, J. Programming Perl. O REILLY (2000).

[4] Liu, M. Distributed Computing. Addison Wesley. Boston (2004). [5] Stein, L. Network Programming with Perl. Addison Wesley. Boston (2001). [6] Christiansen, T. and Torkington, N. Perl Cookbook. O REILLY. Sebastopol, CA (1998). [7] Schwartz, R. Speeding up Your Perl Programs. Sys Admin CMP Media LLC. November 2003, pp. 43-44. [8] Russell, S., Norvig, P. Artificial Intelligence A Modern Approach. Prentice Hall. New Jersey (2003).