Application Note: 202

MDK-ARM Compiler Optimizations
Getting the Best Optimized Code for your Embedded Application

Abstract

This document examines the ARM Compilation Tools, as used inside the Keil MDK-ARM (Microcontroller Development Kit), and how to use them to optimize your code for best performance or smallest code size.

Contents

ARM Compilation Tools
Compiler Options for Embedded Applications
Optimizing for Smallest Code Size
    Compile the Measure example without any optimizations
    Optimize the Measure example for Size
Optimizing for Best Performance
    Run the Dhrystone benchmark without any optimizations
    Optimize the Dhrystone example for Performance
Summary

Revision History

August 2009: Initial Version

Information in this file, the accompanying manuals, and software is Copyright (c) ARM. All rights reserved.

ARM Compilation Tools

The ARM Compilation Tools are the only compilation tools co-developed with the ARM processors, and specifically designed to optimally support the ARM architecture. They are a result of 20 years of development, and are recognized as the industry-leading C and C++ compilation tools for the ARM, Thumb, and Thumb-2 instruction sets.

The ARM Compilation Tools consist of:

- The ARM Compiler, which enables you to compile C and C++ code. It is an optimizing compiler, and features command-line options that let you control the level of optimization.
- Linker and Utilities, which assign addresses and lay out sections of code to form a final image.
- A selection of libraries, including the ISO standard C libraries, and the MicroLIB C library, which is optimized for embedded applications.
- Assembler, which generates machine code instructions from ARM, Thumb or Thumb-2 assembly-level source code.

Compiler Options for Embedded Applications

The ARM Compilation Tools include a number of compiler optimizations to help you best target your code for your chosen microcontroller device and application area. They can be accessed from within µVision by clicking Project -> Options for Target. The options described in this document can be found on the Target and C/C++ tabs of the Options for Target dialog.

Cross-Module Optimization takes information from a prior build and uses it to place unused functions into their own ELF section in the corresponding object file. This option is also known as Linker Feedback, and requires you to build your application twice to take advantage of it for reduced code size. Cross-Module Optimization has been shown to reduce code size, by removing unused functions from your application. It can also improve the performance of your application, by allowing modules to share inline code.

The MicroLIB C library has been optimized to reduce the size of embedded applications. It is a subset of the ISO standard C runtime library, and offers a tradeoff between functionality and code size. Some of the standard C library functions such as memcpy() are slower, while some features of the default library are not supported. Unsupported features include:

- Operating system functions, e.g. abort(), exit(), time(), system(), getenv()
- Wide character and multi-byte support, e.g. mbtowc(), wctomb()
- The stdio file I/O functions, with the exception of stdin, stdout and stderr
- Position-independent and thread-safe code

Use the MicroLIB C library for applications where overall performance can be traded off against the need to reduce code size and memory cost.

Link-Time Code Generation instructs the compiler to create objects in an intermediate format so that the linker can perform further code optimizations. This gives the code generator visibility into cross-file dependencies of all objects simultaneously, allowing it to apply a higher level of optimizations. Link-time code generation can reduce code size, and allow your application to run faster.

Optimization Levels can also be adjusted. The different levels of optimization allow you to trade off between the level of debug information available in the compiled code, and the performance of the code. The following optimization levels are available:

-O0 applies minimum optimizations. Most optimizations are switched off, and the code generated has the best debug view.

-O1 applies restricted optimization. For example, unused inline functions and unused static functions are removed. At this level of optimization, the compiler also applies automatic optimizations such as removing redundant code and re-ordering instructions so as to avoid an interlock situation. The code generated is reasonably optimized, with a good debug view.

-O2 applies high optimization (this is the default setting). Optimizations applied at this level take advantage of ARM's in-depth knowledge of the processor architecture, to exploit processor-specific behavior of the given target. It generates well optimized code, but with limited debug view.

-O3 applies the most aggressive optimization. The optimization is in accordance with the user's -Ospace/-Otime choice. By default, multi-file compilation is enabled, which leads to a longer compile time, but gives the highest levels of optimization.
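To make the -O1 description more concrete, the following C sketch shows the kinds of code it mentions: an unused inline function and a redundant assignment. The function names are hypothetical and are not taken from any MDK-ARM example. Whether a particular build actually removes them depends on the compiler version and the level selected.

```c
#include <stdint.h>

/* Hypothetical helper that is never called in this file.
 * It is static, so no other file can call it either, which makes it
 * a candidate for removal as an unused inline function. */
static inline uint32_t unused_scale(uint32_t x)
{
    return x * 3u;
}

/* Used function containing redundant code: the first value stored in
 * 'result' is overwritten before it is ever read, so an optimizer is
 * free to drop that first computation. */
uint32_t average_of_two(uint32_t a, uint32_t b)
{
    uint32_t result = a + b;                       /* redundant store */
    result = (a / 2u) + (b / 2u) + ((a & b) & 1u); /* floor((a + b) / 2) */
    return result;
}
```

Building a file like this at -O0 and again at -O1, then comparing the reported code size, is one way to observe the effect on your own code.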

The Optimize for Time checkbox causes the compiler to optimize with a greater focus on achieving the best performance when checked (-Otime) or the smallest code size when unchecked (-Ospace).

Unchecking Optimize for Time selects the -Ospace option, which instructs the compiler to perform optimizations to reduce the image size at the expense of a possible increase in execution time. For example, it uses out-of-line function calls instead of inline code for large structure copies. This is the default option. When running the compiler from the command line, this option is invoked using -Ospace.

Checking Optimize for Time selects the -Otime option, which instructs the compiler to optimize the code for the fastest execution time, at the risk of an increase in the image size. It is recommended that you compile the time-critical parts of your code with -Otime, and the rest with -Ospace.

Split Load and Store Multiples instructs the compiler to split LDM and STM instructions involving a large number of registers into a series of loads/stores of fewer registers each. This means that an LDM of 16 registers can be split into 4 separate LDMs of 4 registers each. This option helps to reduce the interrupt latency on ARM systems which do not have a cache or write buffer, and systems which use zero-wait state 32-bit memory. For example, the ARM7 and ARM9 processors can only take an exception on an instruction boundary. If an exception occurs at the start of an LDM of 16 registers in a cacheless ARM7/ARM9 system, the system will finish making 16 accesses to memory before taking the exception. Depending on the memory arbitration system, this can result in a very high interrupt latency. Breaking the LDM into 4 individual LDMs of 4 registers means that the processor will take the exception after loading a maximum of 4 registers, thereby greatly reducing the interrupt latency. Selecting this option improves the overall performance of the system.

The One ELF Section per Function option tells the compiler to put all functions into their own individual ELF sections. This allows the linker to remove unused functions. An ELF code section typically contains the code for a number of functions. The linker is normally only able to remove unused ELF sections, not unused functions. An ELF section can only be removed if all its contents are unused. Therefore, splitting each function into its own ELF section allows the linker to easily identify which ones are unused, and remove them. Selecting this option increases the time required to compile your code, but results in improved performance.

The combination of options applied will depend on your optimization goal: whether you are optimizing for smallest code size, or best performance. The next section illustrates the best optimization options for each of these goals.
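The recommendation above to build time-critical code with -Otime and the rest with -Ospace can also be approximated inside a single source file. The sketch below assumes the ARM Compiler (armcc) pragmas `#pragma Otime`, `#pragma push` and `#pragma pop`, and the predefined macro `__CC_ARM`; check the compiler documentation for your tool version before relying on them. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: __CC_ARM identifies the ARM Compiler (armcc) shipped with
 * MDK-ARM. Other compilers skip the pragmas and use their own settings. */
#if defined(__CC_ARM)
#pragma push    /* save the current pragma state */
#pragma Otime   /* favor execution time for the code that follows */
#endif

/* Hypothetical time-critical routine: sums a block of ADC samples. */
uint32_t sum_samples(const uint16_t *samples, size_t count)
{
    uint32_t total = 0u;
    size_t i;

    for (i = 0u; i < count; ++i) {
        total += samples[i];
    }
    return total;
}

#if defined(__CC_ARM)
#pragma pop     /* restore the saved state for the rest of the file */
#endif

/* Hypothetical non-critical routine, built with the project-wide setting. */
uint32_t mean_sample(const uint16_t *samples, size_t count)
{
    if (count == 0u) {
        return 0u;
    }
    return sum_samples(samples, count) / (uint32_t)count;
}
```

Keeping the pragmas behind the `__CC_ARM` check lets the same file build unchanged with other toolchains, for example when running unit tests on a host PC.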

Optimizing for Smallest Code Size

To optimize your code for the smallest size, the best options to apply are:

- The MicroLIB C library
- Cross-module optimization
- Optimization level 2 (-O2)

Compile the Measure example without any optimizations

The Measure example uses analog and digital inputs to simulate a data logger.

- File -> Open Project: C:\Keil\ARM\Boards\Keil\MCBSTM32\Measure\Measure.uv2
- Click the Options for Target button.
- In the Target tab:
    - Uncheck Cross-Module Optimization
    - Uncheck Use MicroLIB
    - Uncheck Use Link-Time Code Generation
- In the C/C++ tab:
    - Set Optimization Level to Zero
- Click OK to save your changes.
- Project -> Build target

Without any compiler optimizations applied, the initial code size is 13,656 Bytes.

Optimize the Measure example for Size

Apply the compiler optimizations in turn, and re-compile each time to see their effect in reducing the code size for the example.

- Options for Target -> Target tab: Use the MicroLIB C library
- Options for Target -> Target tab: Use cross-module optimization (remember to compile twice)
- Options for Target -> C/C++ tab: Enable Optimization level 2 (-O2)

Size reductions are measured against the unoptimized build (13,656 Bytes):

Optimization Applied          Compile Size    Size Reduction   Improvement
MicroLIB C library            8,960 Bytes     4,696 Bytes      34% smaller
Cross-Module Optimization     13,500 Bytes    156 Bytes        1.1% smaller
Optimization level -O2        12,936 Bytes    720 Bytes        5.3% smaller
All 3 optimization options    8,116 Bytes     5,540 Bytes      40.6% smaller

Applying all the optimizations reduces the code size to 8,116 Bytes. The fully optimized code is 5,540 Bytes smaller, a total code size reduction of 40.6%.
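The size-reduction and improvement figures above are simple arithmetic against the 13,656-Byte baseline. As a minimal sketch, the two helper functions below (hypothetical names, not part of MDK-ARM) reproduce that calculation so it can be applied to the sizes reported for your own builds.

```c
#include <stdint.h>

/* Bytes saved relative to the unoptimized baseline build. */
uint32_t bytes_saved(uint32_t baseline, uint32_t optimized)
{
    return baseline - optimized;
}

/* Size reduction expressed as a percentage of the baseline. */
double percent_smaller(uint32_t baseline, uint32_t optimized)
{
    return 100.0 * (double)(baseline - optimized) / (double)baseline;
}
```

For the fully optimized Measure build, bytes_saved(13656, 8116) gives 5540 and percent_smaller(13656, 8116) is approximately 40.6, matching the figures quoted above.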

Optimizing for Best Performance

To optimize your code for performance, the best options to apply are:

- Cross-module optimization
- Optimization level 3 (-O3)
- Optimize for time

Run the Dhrystone benchmark without any optimizations

The Dhrystone benchmark is used to measure and compare the performance of different computers, or the efficiency of the code generated for the same computer by different compilers.

- File -> Open Project: C:\Keil\ARM\Examples\DHRY\DHRY.uv2
- Click the Options for Target button.
- Turn off optimization settings in the Target and C/C++ tabs, then click OK.
- Project -> Build target
- Enter Debug mode.
- View -> Serial Windows -> UART #1: open the UART #1 window.
- View -> Analysis Windows -> Performance Analyzer: open the Performance Analyzer.
- Debug -> Run: start running the application.
- When prompted, enter 50000 in the UART #1 window and press Enter.

In the Performance Analyzer window, note that:

- The dhry_1 loop took 2.829s
- The dhry_2 loop took 2.014s

In the UART #1 window, note that:

- It took 138.0 microseconds for 1 run through Dhrystone
- The application is executing 7246.4 Dhrystones per second

Optimize the Dhrystone example for Performance

Re-compile the example with all three of the following optimizations applied:

- Options for Target -> Target tab: Cross-module optimization (remember to compile twice)
- Options for Target -> C/C++ tab: Optimization level 3 (-O3)
- Options for Target -> C/C++ tab: Optimize for Time

Re-run the application, and examine the performance.

Measurement                                 Without optimizations   With optimizations   Improvement
dhry_1                                      2.829s                  1.695s               40.1% faster
dhry_2                                      2.014s                  1.011s               49.8% faster
Microseconds for 1 run through Dhrystone    138.0                   70                   49.3% faster
Dhrystones per second                       7246.4                  14,285.7             97.1% more

The fully optimized code achieves approximately 2x the performance of the unoptimized code.
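The benchmark figures above are related by straightforward arithmetic: Dhrystones per second is one million divided by the microseconds per run, and the improvement column is a percentage of the unoptimized value. The helper functions below are a hypothetical sketch of those calculations, not code from the DHRY example.

```c
/* Dhrystones per second, given the time for one run in microseconds. */
double dhrystones_per_second(double microseconds_per_run)
{
    return 1.0e6 / microseconds_per_run;
}

/* Percentage reduction in elapsed time (the "% faster" rows). */
double percent_time_saved(double before, double after)
{
    return 100.0 * (before - after) / before;
}

/* Percentage increase in throughput (the "% more" row). */
double percent_more(double before, double after)
{
    return 100.0 * (after - before) / before;
}
```

For example, dhrystones_per_second(138.0) is roughly 7246.4 and dhrystones_per_second(70.0) is roughly 14285.7, which is where the "approximately 2x" conclusion comes from.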

Summary

The ARM Compilation Tools offer a range of options to apply when compiling your code. These options can be combined to optimize your code for best performance, for smallest code size, or for any performance point between these two extremes, to best suit your targeted microcontroller device and market.

When optimizing your code, MDK-ARM makes it easy and convenient to measure the effect of the different optimization settings on your application. The code size is clearly displayed after compilation, and a range of analysis tools such as the Performance Analyzer enable you to measure performance.

The optimization options in the ARM Compilation Tools, together with the easy-to-use analysis tools in MDK-ARM, help you to easily optimize your application to meet your specific requirements.