Performance Analysis and Software Optimization on Systems Using the LAN91C111



Similar documents
TECHNICAL NOTE 7-1. ARCNET Cable Length vs. Number of Nodes for the HYC9088 in Coaxial Bus Topology By: Daniel Kilcourse May 1991.

High-Speed Inter-Chip (HSIC) USB 2.0 to 10/100 Ethernet

EMC6D103S. Fan Control Device with High Frequency PWM Support and Hardware Monitoring Features PRODUCT FEATURES ORDER NUMBERS: Data Brief

2 How to Set the Firewall when Using OptoLyzer Suite?

AN SMSC Design Guide for Power Over Ethernet Applications. 1 Introduction. 1.1 Power Over Ethernet

LAN9514/LAN9514i. USB 2.0 Hub and 10/100 Ethernet Controller PRODUCT FEATURES PRODUCT PREVIEW. Highlights. Target Applications.

USB2229/USB th Generation Hi-Speed USB Flash Media and IrDA Controller with Integrated Card Power FETs PRODUCT FEATURES.

LAN7500/LAN7500i. Hi-Speed USB 2.0 to 10/100/1000 Ethernet Controller PRODUCT FEATURES PRODUCT PREVIEW. Highlights. Target Applications.

Intel Data Direct I/O Technology (Intel DDIO): A Primer >

Using the RS232 serial evaluation boards on a USB port

Intel Ethernet and Configuring Single Root I/O Virtualization (SR-IOV) on Microsoft* Windows* Server 2012 Hyper-V. Technical Brief v1.

Accelerating High-Speed Networking with Intel I/O Acceleration Technology

Understanding LCD Memory and Bus Bandwidth Requirements ColdFire, LCD, and Crossbar Switch

Open Source Used In Cisco Instant Connect for ios Devices 4.9(1)

COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service

TERMS AND CONDITIONS

CITRIX SYSTEMS, INC. SOFTWARE LICENSE AGREEMENT

Technical Brief. DualNet with Teaming Advanced Networking. October 2006 TB _v02

AGREEMENT BETWEEN USER AND International Network of Spinal Cord Injury Nurses

Intel Ethernet Switch Load Balancing System Design Using Advanced Features in Intel Ethernet Switch Family

AN AES encryption and decryption software on LPC microcontrollers. Document information

C. System Requirements. Apple Software is supported only on Apple-branded hardware that meets specified system requirements as indicated by Apple.

AGREEMENT BETWEEN USER AND Global Clinical Research Management, Inc.

Bandwidth Calculations for SA-1100 Processor LCD Displays

Terms & Conditions Template

The Credit Control, LLC Web Site is comprised of various Web pages operated by Credit Control, LLC.

BACKUPPRO TERMS OF USE AND END USER LICENSE AGREEMENT

Website Hosting Agreement

Windows Server Performance Monitoring

Cyberoam Multi link Implementation Guide Version 9

HEF4011B. 1. General description. 2. Features and benefits. 3. Ordering information. 4. Functional diagram. Quad 2-input NAND gate

TC Electronic Extended Warranty Policy

Intel SSD 520 Series Specification Update

Steps to Migrating to a Private Cloud

Storage Solutions Overview. Benefits of iscsi Implementation. Abstract

EView/400i Management Pack for Systems Center Operations Manager (SCOM)

Compatibility Matrix. VPN Authentication by BlackBerry. Version 1.7.1

BlackBerry Business Cloud Services. Version: Release Notes

CKEditor for Drupal License Agreement

BlackBerry Professional Software For Microsoft Exchange Compatibility Matrix January 30, 2009

Fiber Channel Over Ethernet (FCoE)

Intel X38 Express Chipset Memory Technology and Configuration Guide

AN LPC24XX external memory bus example. Document information

MTConnect Institute Public Comment and Evaluation License Agreement

16-Port and 24-Port 10/100 Switches

VATSIM USER AGREEMENT

Breach Found. Did It Hurt?

Web Hosting Agreement

AGREEMENT BETWEEN USER AND Caduceon Environmental Laboratories Customer Portal

Intel RAID Controllers

MDM Zinc 3.0 End User License Agreement (EULA)

Configuring RAID for Optimal Performance

CKEditor - Enterprise OEM License

ENTERPRISE EDITION INSTALLER END USER LICENCE AGREEMENT THIS AGREEMENT CONSISTS OF THREE PARTS:

COSBench: A benchmark Tool for Cloud Object Storage Services. Jiangang.Duan@intel.com

MySeoNetwork Reseller Agreement -Revised June 2, (800) ; (410)

ENHANCED HOST CONTROLLER INTERFACE SPECIFICATION FOR UNIVERSAL SERIAL BUS (USB) ADOPTERS AGREEMENT

AN Boot mode jumper settings for LPC1800 and LPC4300. Document information

FILEMAKER PRO ADVANCED SOFTWARE LICENSE

MOST Training and Workshops

TERMS AND CONDITIONS

Installation Guide for. 10/100 to Triple-speed Port Aggregator. Model TPA-CU Doc. PUBTPACUU Rev. 1, 12/08. In-Line

TestFlight FAQ Apple Inc.

CORPORATE HEADQUARTERS Elitecore Technologies Ltd. 904 Silicon Tower, Off. C.G. Road, Ahmedabad , INDIA

GENOA, a QoL HEALTHCARE COMPANY GENOA ONLINE SYSTEM TERMS OF USE

Installation Guide for GigaBit Fiber Port Aggregator Tap with SFP Monitor Ports

1. GRANT OF LICENSE. Acunetix Ltd. grants you the following rights provided that you comply with all terms and conditions of this EULA:

A Superior Hardware Platform for Server Virtualization

1-of-4 decoder/demultiplexer

z/os V1R11 Communications Server System management and monitoring Network management interface enhancements

Website & Hosting Terms & Conditions

Intel Desktop Board DP55WB

Intel Desktop Board DG41WV

an analysis of RAID 5DP

Website TERMS OF USE AND CONDITIONS

AMERICAN INSTITUTES FOR RESEARCH OPEN SOURCE SOFTWARE LICENSE

prevailing of JAMS/Endispute. The arbitrator's award shall be binding and may be entered as a judgment in any court of competent jurisdiction.

Proactive Performance Management for Enterprise Databases

How To Monitor And Test An Ethernet Network On A Computer Or Network Card

Seaside High-Speed Internet Acceptable Use Policy (AUP)

E-Sign Disclosure & E-Statements Terms and Conditions

Self Help Guides. Setup Exchange with Outlook

Long Island IVF Terms and Conditions of Use

Please read these Terms and Conditions carefully. They Govern your access and use of our Website and services on it.

Question: 3 When using Application Intelligence, Server Time may be defined as.

A Dell Technical White Paper Dell PowerConnect Team

PLEASE READ CAREFULLY BEFORE DOWNLOADING OR STREAMING THIS APP.

SOFTWARE LICENSE LIMITED WARRANTY

Web Site Hosting Service Agreement

Release Notes. BlackBerry Web Services. Version 12.1

Virtual LAN Configuration Guide Version 9

Lecture 8 Performance Measurements and Metrics. Performance Metrics. Outline. Performance Metrics. Performance Metrics Performance Measurements

COM Port Stress Test

Intel Desktop Board D945GCPE

AVR32701: AVR32AP7 USB Performance. 32-bit Microcontrollers. Application Note. Features. 1 Introduction

BlackBerry Enterprise Server Express for Microsoft Exchange

Intel Q35/Q33, G35/G33/G31, P35/P31 Express Chipset Memory Technology and Configuration Guide

Transcription:

AN 10.12 Performance Analysis and Software Optimization on Systems Using the LAN91C111 1 Introduction This application note describes one approach to analyzing the performance of a LAN91C111 implementation and fine-tuning the software to increase system throughput. Often, the user looks to the device driver for an answer to boost the overall Ethernet packet processing (receiving) speed in the system. While optimizations in the driver may well be sufficient to enhance the performance so that the system throughput requirements can be met, it by no means can be guaranteed. It is useful to understand all parameters that play a part in affecting the observed system performance. It is important to understand where the bottleneck is, and it is important to recognize the point where one has done all one could in software optimization. 1.1 Some Queuing Theory The first step to establishing a framework for calculating the theoretical throughput is to adopt a mathematical model used commonly in network performance analysis. For this, the author went back to a textbook he used years ago. The reference to the textbook is given at the end of this application note. There are lots, and lots, of books on the subject of queuing theory. Only the results of the applicable model are presented here. If the reader is interested in the derivation, please enjoy yourself by referring to one of these books. The mathematical model that is most applicable to our case is known as an Μ/Μ/1/Κ queuing system. This notation denotes a system as follows: λ µ K Figure 1.1 Μ/Μ/1/Κ Queuing System 1. There is one input stream of packets, as there is only one PHY in our LAN91C111. 2. There is one process which retrieves the packets from the LAN91C111. 3. A certain number, Κ, of buffers form a queue, are responsible for receiving packets. SMSC AN 10.12 Revision 1.2 (01-13-04)

The packet traffic at the input of the queue is characterized by the average number of packets per second, λ, at which they arrive at the queue. The device driver takes the packets out of the queue at an average rate of µ packets per second. Both of these processes are modeled by a probabilistic distribution called the Poisson distribution. To cut to the chase, the following expressions give, in the steady state, the probability p K, that a packet arrives at the input of the system and finds the queue full, at which the point the packet will be dropped by the LAN91C111. ( 1 a) K a p K = for λ µ K + 1 1 a 1 p K = for λ = µ K + 1 where a = λ/µ. Figure 1.2 Poisson Distribution The throughput is related to the probability that a packet arrives at the queue and finds the queue not full, or 1 - p K. This is proportional to the number of packets not dropped in a given period. Revision 1.2 (01-13-04) 2 SMSC AN 10.12

2 Performance as a Function of µ It is apparent that the performance characteristics of the system depends on several variables: how fast the system removes the packet from the LAN91C111, how many buffers are allocated in the LAN91C111 for receiving packets, and, less intuitively, how fast the packets are arriving the LAN91C111. The following figure depicts the system throughput, expressed as the ratio between the rate of packets not dropped by the system and the rate of total packets arriving, for the given scenario. This we define as the bandwidth utilization, which is the same as 1 - p K. 1.00 0.90 0.80 0.70 Utilization 0.60 0.50 0.40 0.30 λ = 100 λ = 1000 λ = 10000 λ = 50000 λ = 100000 0.20 0.10 0.00 10000 20000 30000 40000 50000 60000 70000 80000 90000 100000 µ Figure 2.1 Bandwidth Utilization for K=1 It easy to see that the faster the CPU is able to take packets out of the LAN91C111 buffers, the better the system is able to receive all or most of the input packets. It seems obvious, but one can never expect to receive more than µ packets per second, on the average. SMSC AN 10.12 3 Revision 1.2 (01-13-04)

3 Performance as a function of K The number of buffers allocated for receiving packets also affects the performance of the system. Figure 3.1 correlates the utilization to the number of receiving buffers for a particular packet arrival process characteristic. 1.00 0.90 0.80 0.70 Utilization 0.60 0.50 0.40 Κ=1 Κ=2 Κ=3 Κ=4 0.30 0.20 0.10 0.00 10000 20000 30000 40000 50000 60000 70000 80000 90000 100000 µ Figure 3.1 Bandwidth Utilization for λ = 10000 It can be seen that, given our mathematical model, increasing the number of buffers in the queue beyond 2 does not greatly affect the system throughput when the system s processing speed is high compared to the packet arrival rate. Revision 1.2 (01-13-04) 4 SMSC AN 10.12

4 System Performance Analysis Procedure Follow the steps below to analyze the system before attempting to optimize. Knowing where the bottleneck is gives you the most bang for the buck in terms of using your efforts wisely. 1. Are packets currently being discarded by the driver (or anywhere else in software) because the TCP/IP stack input buffer is full? If so, the application software or the operating system is too slow. Optimizing the driver will not help to increase the system performance. 2. Determine the number of buffers allocated for receiving in the LAN91C111. This is your K. 3. Attempt to measure how much time, in seconds, it takes to process one packet in the Interrupt Service Routine (ISR). This includes the following times: a. Interrupt latency b. Time required to copy the packet data from the LAN91C111 to main memory The reciprocal of this number is your µ. You will need to instrument your software to measure this over a large number of packets. µ is the average processing rate. 4. Plot the equation on the first page for 1 - p K, as a function of λ, with the Κ and µ values found from the two steps above. It should look something like one of the following curves. 1.00 0.90 0.80 Utilization 0.70 0.60 0.50 0.40 0.30 0.20 0.10 0.00 µ = 100 µ = 1000 µ = 10000 µ = 50000 µ = 100000 10000 20000 30000 40000 50000 60000 70000 80000 90000 100000 λ Figure 4.1 Bandwidth Utilization for K=3 5. Measure how quickly the packets arrive at your system. Over a period of time, calculate the average number of packets arriving per second. This is your actual λ. Since your system is dropping packets (or else, you might not care to do this analysis in the first place), you should not use your software to do the measurement. You will want to do this measurement by using a network sniffer or such similar equipment. SMSC AN 10.12 5 Revision 1.2 (01-13-04)

6. Measure the number of packets received by the software. Correlate λ to the chart from Step 3. Does it make sense in terms of how many packets are currently being dropped? 5 System Performance Optimization Recommendations Try the following techniques in software to increase the system throughput performance. Do them one at a time. Each time after modifying the software, reassess the new performance by repeating Step 6 described in the System Performance Analysis Procedure. 1. Allocate the right number of buffers, Κ, in the LAN91C111 for receiving. Larger is better, but take care to not negatively affect packet transmission performance to the point of not meeting system requirements. The effectiveness of varying this parameter depends on the burstiness of the packets arriving at the input of the system, which in turn is dependent on the network application and the instantaneous network condition. 2. Rewrite the ISR to make copying packet data from the LAN91C111 buffers to main memory more efficient. Consider rewriting it in assembly language if necessary. Copy all packets present in the buffers for each ISR invocation. There is no downside to taking this optimization action (except that assembly code is less readable, harder to maintain, and more difficult to port). 3. Increase the interrupt priority for packet receives to lower the interrupt latency. Shortening the interrupt latency lowers the buffer depth, Κ, requirement. Take care to prevent starving other (nonnetworking) processes of CPU cycles. If you still need more performance gain after taking the actions above, there are usually things that can be tried to modify the hardware design in order to increase µ.. However, it is difficult to make specific recommendations without seeing the particular designs. We heard stories that involve the LAN91C111 on a 16-bit bus with 20 wait states there is no software in the world that is going to fix that! As our rule of thumb, the raw bus bandwidth should be about twice the maximum data bandwidth through an I/O device. For the LAN91C111, that means 50 MBps, or 80 ns cycle time on a 32-bit bus if you want to do 100 Mbps full-duplex. 6 References Schwartz, Mischa, Computer-Communication Network Design and Analysis, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1977. Revision 1.2 (01-13-04) 6 SMSC AN 10.12

80 Arkay Drive Hauppauge, NY 11788 (631) 435-6000 FAX (631) 273-3123 Copyright SMSC 2004. All rights reserved. Circuit diagrams and other information relating to SMSC products are included as a means of illustrating typical applications. Consequently, complete information sufficient for construction purposes is not necessarily given. Although the information has been checked and is believed to be accurate, no responsibility is assumed for inaccuracies. SMSC reserves the right to make changes to specifications and product descriptions at any time without notice. Contact your local SMSC sales office to obtain the latest specifications before placing your product order. The provision of this information does not convey to the purchaser of the described semiconductor devices any licenses under any patent rights or other intellectual property rights of SMSC or others. All sales are expressly conditional on your agreement to the terms and conditions of the most recently dated version of SMSC's standard Terms of Sale Agreement dated before the date of your order (the "Terms of Sale Agreement"). The product may contain design defects or errors known as anomalies which may cause the product's functions to deviate from published specifications. Anomaly sheets are available upon request. SMSC products are not designed, intended, authorized or warranted for use in any life support or other application where product failure could cause or contribute to personal injury or severe property damage. Any and all such uses without prior written approval of an Officer of SMSC and further testing and/or modification will be fully at the risk of the customer. Copies of this document or other SMSC literature, as well as the Terms of Sale Agreement, may be obtained by visiting SMSC s website at http://www.smsc.com. SMSC is a registered trademark of Standard Microsystems Corporation ( SMSC ). Product names and company names are the trademarks of their respective holders. SMSC DISCLAIMS AND EXCLUDES ANY AND ALL WARRANTIES, INCLUDING WITHOUT LIMITATION ANY AND ALL IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE, AND AGAINST INFRINGEMENT AND THE LIKE, AND ANY AND ALL WARRANTIES ARISING FROM ANY COURSE OF DEALING OR USAGE OF TRADE. IN NO EVENT SHALL SMSC BE LIABLE FOR ANY DIRECT, INCIDENTAL, INDIRECT, SPECIAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES; OR FOR LOST DATA, PROFITS, SAVINGS OR REVENUES OF ANY KIND; REGARDLESS OF THE FORM OF ACTION, WHETHER BASED ON CONTRACT; TORT; NEGLIGENCE OF SMSC OR OTHERS; STRICT LIABILITY; BREACH OF WARRANTY; OR OTHERWISE; WHETHER OR NOT ANY REMEDY OF BUYER IS HELD TO HAVE FAILED OF ITS ESSENTIAL PURPOSE, AND WHETHER OR NOT SMSC HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. SMSC AN 10.12 7 Revision 1.2 (01-13-04)