An Efficient Implementation of Automatic Cloud Cover Assessment (ACCA) on a Reconfigurable Computer



Esam El-Araby, Mohamed Taher, Tarek El-Ghazawi
The George Washington University, Washington, DC 20037
esam@gwu.edu, mtaher@gwu.edu, tarek@gwu.edu

Jacqueline Le Moigne
NASA Goddard Space Flight Center (GSFC), Greenbelt, MD 20771
lemoigne@backserv.gsfc.nasa.gov

Abstract

Clouds have a critical role in many studies, e.g. weather- and climate-related studies. However, they represent a source of errors in many applications, and the presence of cloud contamination can hinder the use of satellite data. This requires a cloud detection process to mask out cloudy pixels from further processing. The trend for remote sensing satellite missions has always been towards smaller size, lower cost, more flexibility, and higher computational power. Reconfigurable Computers (RCs) combine the flexibility of traditional microprocessors with the power of Field Programmable Gate Arrays (FPGAs). Therefore, RCs are a promising candidate for on-board preprocessing. This paper presents the design and implementation of an RC-based real-time cloud detection system. We investigate the potential of using RCs for on-board preprocessing by prototyping the Landsat 7 ETM+ ACCA algorithm on one of the state-of-the-art reconfigurable platforms, SRC-6E. Although a reasonable number of investigations of the ACCA cloud detection algorithm using FPGAs have been reported in the literature, very few details or results were provided, and the contributions were limited. Our work has been shown to provide higher performance and higher detection accuracy.

I. INTRODUCTION

The trend for remote sensing satellite missions has always been towards smaller size, lower cost, and more flexibility. On-board processing, as a solution, permits a good utilization of expensive resources.
Instead of storing and forwarding all captured images, data processing can be performed on-orbit prior to downlink, resulting in the reduction of communication bandwidth as well as in simpler and faster subsequent computations to be performed on ground stations. Consequently, on-board processing can reduce the cost and the complexity of the On-The-Ground/Earth processing systems. Furthermore, it enables autonomous decisions to be taken on-board, which can potentially reduce the delay between image capture, analysis and action. This leads to faster critical decisions, which are crucial for future reconfigurable web sensor missions as well as planetary exploration missions.

The presence of cloud contamination can hinder the use of satellite data, and this requires a cloud detection process to mask out cloudy pixels from further processing. The Landsat 7 ETM+ (Enhanced Thematic Mapper) ACCA (Automatic Cloud Cover Assessment) algorithm [1] is a compromise between the simplicity of earlier Landsat algorithms, e.g. ACCA for Landsat 4 and 5, and the complexity of later approaches such as the MODIS (Moderate Resolution Imaging Spectroradiometer) cloud mask.

Reconfigurable Computers (RCs) combine the flexibility of traditional microprocessors with the power of Field Programmable Gate Arrays (FPGAs). These platforms have always been reported to outperform the conventional platforms in terms of throughput and processing power within the domain of cryptography and image processing applications [2]. In addition, they are characterized by lower form/wrap factors compared to parallel platforms, and higher flexibility than ASIC solutions. Therefore, RCs are a promising candidate for on-board preprocessing. The SRC-6E Reconfigurable Computer is one example of this category of hybrid computers [3] and is used here for this purpose.

This paper presents the design and implementation of an RC-based real-time cloud detection system.
We investigate the potential of using RCs for on-board preprocessing by prototyping the Landsat 7 ETM+ ACCA algorithm on one of the state-of-the-art reconfigurable platforms, SRC-6E. Although a reasonable amount of investigations of the ACCA cloud detection algorithm using FPGAs has been reported in the literature [4], [5], very few details/results were provided and/or limited contributions were accomplished. Our work is unique in providing higher performance and higher detection accuracy.

II. DESCRIPTION OF THE AUTOMATIC CLOUD COVER ASSESSMENT (ACCA) ALGORITHM

The theory of the Landsat 7 ETM+ ACCA algorithm is based on the observation that clouds are highly reflective and cold. The high reflectivity can be detected in the visible, near- and mid-IR bands. The thermal properties of clouds can be detected in the thermal IR band. Table I presents the band wavelengths and their detection features.

TABLE I LANDSAT 7 ETM+ BANDS

The Landsat 7 ETM+ ACCA algorithm recognizes clouds by analyzing the scene twice. In the first pass, eight filters are utilized for this purpose, see Table II.

TABLE II PASS-ONE FILTERS

Omission errors are expected. The goal of pass-one is to develop a reliable cloud signature for use in pass-two, where the remaining clouds are identified. Commission errors, however, create algorithm failure and must be minimized. Three categories result from pass-one: clouds, non-clouds, and an ambiguous group that is revisited in pass-two.

John A. Williams et al. [5], [6] have used band mapping techniques to implement Landsat-based algorithms on MODIS data. The generalized and modified classification rules for Pass-One are shown in Table III.

TABLE III GENERALIZED CLASSIFICATION RULES FOR PASS-ONE [5]

Pass-Two resolves the detection ambiguity resulting from Pass-One. Thermal properties of clouds identified during Pass-One are characterized and used to identify remaining cloud pixels. Band 6 statistical moments (mean, standard deviation, distribution skewness, kurtosis), see equation (1), are computed and new adaptive thresholds are determined accordingly.

$$
\begin{aligned}
\eta &= \frac{1}{n}\sum_{i=1}^{n} x_i \\
\sigma^2 &= \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \eta)^2 \\
\text{Skewness} &= \frac{1}{n}\sum_{i=1}^{n} \left(\frac{x_i - \eta}{\sigma}\right)^3 \\
\text{Kurtosis} &= \frac{1}{n}\sum_{i=1}^{n} \left(\frac{x_i - \eta}{\sigma}\right)^4 - 3
\end{aligned}
\tag{1}
$$

The 95th percentile, i.e. the smallest number that is greater than 95% of the numbers in the given set of pixels, becomes the new thermal threshold for Pass Two. Image pixels that fall below the new thermal threshold and survive the first three Pass-One filters are classified as cloud pixels.
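The Pass-Two statistics can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function names, the list-of-temperatures input, and the fallback to the maximum for very small sets are assumptions, and the moments use the normalization read from equation (1) (1/(n-1) for the variance, 1/n for skewness and kurtosis).

```python
import math
from bisect import bisect_left


def band6_moments(temps):
    """Return (mean, std, skewness, kurtosis) of Band 6 values (hypothetical helper)."""
    n = len(temps)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(temps) / n
    # Sample variance with the n - 1 normalization.
    variance = sum((x - mean) ** 2 for x in temps) / (n - 1)
    sigma = math.sqrt(variance)
    if sigma == 0.0:
        raise ValueError("all samples are equal; higher moments are undefined")
    skewness = sum(((x - mean) / sigma) ** 3 for x in temps) / n
    # Subtracting 3 makes the kurtosis zero for a normal distribution.
    kurtosis = sum(((x - mean) / sigma) ** 4 for x in temps) / n - 3.0
    return mean, sigma, skewness, kurtosis


def percentile_threshold(temps, percent=95):
    """Smallest sample that is greater than `percent` % of the samples.

    Assumption: if no sample qualifies (very small sets), return the maximum.
    """
    ordered = sorted(temps)
    n = len(ordered)
    if n == 0:
        raise ValueError("empty sample set")
    for value in ordered:
        below = bisect_left(ordered, value)  # samples strictly below value
        if below * 100 >= percent * n:       # integer test, no rounding issues
            return value
    return ordered[-1]


def below_thermal_threshold(temps, threshold):
    """Flag pixels whose Band 6 value falls below the Pass-Two threshold."""
    return [t < threshold for t in temps]


if __name__ == "__main__":
    cloud_temps = list(range(201, 301))  # made-up Pass-One cloud temperatures (K)
    print(band6_moments(cloud_temps))
    print(percentile_threshold(cloud_temps))
```

Using an integer `percent` keeps the percentile comparison exact, which mirrors the kind of integer arithmetic a hardware implementation would favor.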
Specifically, the following three conditions must be satisfied in Pass-Two:
- Desert index (Filter 7) is greater than 0.5
- Colder cloud population exceeds 0.4 percent of the scene
- Mean temperature of the cloud class is less than 295K

During processing, a cloud mask is created. The final step is processing the cloud mask for holes. After the two ACCA passes, a filter is applied to the cloud mask to fill in cloud holes. This filtering operation works by examining each non-cloud pixel in the mask. If 5 out of the 8 neighbors are clouds, then the pixel is reclassified as cloud. Cloud cover results from both Pass-One and Pass-Two are compared. Extreme differences are indicative of cloud signature corruption. When this occurs, Pass-Two results are ignored and all results are taken from Pass-One. The final cloud cover percentage for the image is calculated based on the filtered cloud mask. The cloud pixels in the mask are tabulated and a cloud cover percentage score for the scene is computed.

III. SRC-6E RECONFIGURABLE COMPUTER

A. Hardware Architecture

The SRC-6E platform consists of two general-purpose microprocessor boards and one MAP reconfigurable processor board. Each microprocessor board is based on two 1GHz (3GHz) Pentium 3 (Pentium 4) microprocessors. The SRC-6E MAP board consists of two MAP reconfigurable processors. Overall, the SRC-6E system provides a 1:1 microprocessor to FPGA ratio. Microprocessor boards are connected to the MAP board through the SNAP interconnect. The SNAP card plugs into the DIMM slot on the microprocessor motherboard [3] to provide higher data transfer rates between the boards than the inefficient but common PCI solution. The peak transfer rate between a microprocessor board and the MAP board is 1600 MB/sec for the Pentium 4 version of SRC-6E.

Fig. 1 Hardware Architecture of SRC-6E

The hardware architecture of the SRC-6E MAP processor is shown in Fig. 1. The MAP board is composed of two identical MAP units. Each MAP unit has one control FPGA and two user FPGAs, all Xilinx Virtex II-6000-4. Additionally, each MAP unit contains six interleaved banks of on-board memory (OBM) with a total capacity of 24 MB. The maximum aggregate data transfer rate among all FPGAs and on-board memory is 4800 MB/s. The user FPGAs are configured in such a way that one is in the master mode and the other is in the slave mode. The two FPGAs of a MAP are directly connected using a bridge port. Furthermore, MAP processors can be chained together using a chain port to create an array of FPGAs.

In the typical mode of operation, input data is first transferred through the Control FPGA from the microprocessor memory to OBM. This transfer is followed by computations performed by the User FPGA, which fetches input data from OBM and transfers results back to OBM. Finally, the results are transmitted back from OBM to microprocessor memory.

B. Programming Model

The SRC-6E has a compilation process somewhat similar to that of a conventional microprocessor-based computing system, but it also produces logic for the MAP reconfigurable processor, see Fig. 2.

Fig. 2 SRC Compilation Process

There are two types of application source files to be compiled.
Source files of the first type are compiled with the Intel processor as the target execution platform. Source files of the second type are compiled with execution on the MAP reconfigurable processor as a target. A file that contains a program to be executed on the Intel processor is compiled using the traditional microprocessor compiler. All files containing functions that call hardware macros and thus execute on the MAP are compiled by the MAP compiler. MAP source files contain MAP functions that are mainly composed of macro calls. Here, a macro is defined as a piece of hardware logic designed to implement a certain function. Since users often wish to extend the built-in set of operators, the compiler allows users to integrate their own VHDL/Verilog macros.

IV. IMPLEMENTATION DETAILS AND EXPERIMENTAL RESULTS

The ACCA algorithm adapted for Landsat 7 ETM+ data has been implemented both in C and Matlab, and pass-one has been implemented and synthesized for the Xilinx XC2V6000 FPGA on SRC-6E. Fig. 3 shows the implementation architecture of pass-one of the algorithm. This module reads the five band images and feeds them to six filters generating the output mask. The function of each filter is listed in Table II. Our goal for the implementation was to achieve comparable performance and detection accuracy to what has been reported in [4] and [5]. Therefore, the only constraint on our design was the processing speed, as measured by throughput. This constraint is approached through full pipelining of the design. Many of the tests in pass-one are threshold tests of ratio values, such as the snow test. We found that it was more efficient, in terms of the required resources, to multiply one value by the threshold and compare the product with the other value, instead of performing the division and then comparing against the threshold.
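The division-free ratio test can be illustrated with a small Python sketch. It is not taken from the paper: the 8-bit fractional precision, the function names, and the use of a normalized difference of two integer band values are assumptions made for the example. The rewrite relies on the denominator being positive, so that num/den < t is equivalent to num < t * den.

```python
from fractions import Fraction

FRAC_BITS = 8  # hypothetical number of fractional bits used for the threshold


def to_fixed(threshold, frac_bits=FRAC_BITS):
    """Quantize a real threshold to an integer with `frac_bits` fractional bits."""
    return round(threshold * (1 << frac_bits))


def ratio_below(numerator, denominator, threshold_fixed, frac_bits=FRAC_BITS):
    """Test numerator/denominator < threshold using one multiply and one compare.

    Both operands are integers; the denominator must be positive so that
    multiplying through does not flip the inequality.
    """
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return (numerator << frac_bits) < threshold_fixed * denominator


def ratio_below_reference(numerator, denominator, threshold_fixed, frac_bits=FRAC_BITS):
    """Same test done with an explicit exact division, for comparison only."""
    return Fraction(numerator, denominator) < Fraction(threshold_fixed, 1 << frac_bits)


def normalized_difference_below(a, b, threshold_fixed, frac_bits=FRAC_BITS):
    """Test (a - b) / (a + b) < threshold for two integer band values."""
    return ratio_below(a - b, a + b, threshold_fixed, frac_bits)


if __name__ == "__main__":
    thr = to_fixed(0.7)  # illustrative threshold value
    print(normalized_difference_below(200, 20, thr))  # ratio is about 0.82
    print(normalized_difference_below(100, 90, thr))  # ratio is about 0.05
```

In hardware terms the multiply-and-compare form replaces a divider with a multiplier and a comparator, which is the resource saving the text describes.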

Fig. 3 Pass-One Module

Fig. 4 Detection Accuracy

The design was developed in VHDL, synthesized, placed and routed, and was found to occupy approximately 7% of the available logic resources on the chip. This enabled us to instantiate eight concurrent processing engines of the design in the same chip, which increased the performance to eight times what we expected. The total resource utilization for the eight-engine version was approximately 57%. This leaves plenty of room for more engines and/or processing functions to be implemented on the same chip. The maximum operational clock speed of the design is 100MHz, which resulted in 4000 Megapixels/sec (5 inputs x 8 engines x 100MHz) as the data input/consumption rate. Furthermore, the data output/production rate was 800 Megapixels/sec (1 output x 8 engines x 100MHz).

Fig. 4 shows the results obtained from a 2.8GHz Intel Xeon processor and from SRC-6E. The hardware (fixed-point) implementation provided very high performance (28 times faster) compared to the 2.8GHz Xeon implementation. We have obtained approximately ideal detection accuracy (0% detection error) as compared to software floating-point reference results. These results were achieved by performing the internal algorithm filtering with full-precision (23-bit) fixed-point arithmetic.
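The rate arithmetic for a fully pipelined design can be checked with a short Python sketch. It assumes a 100MHz clock and that each engine consumes five band values and produces one mask value per clock cycle; the function name is hypothetical.

```python
def rate_mpixels_per_sec(values_per_cycle, engines, clock_mhz):
    """Peak rate of a fully pipelined design, in Megapixels/sec.

    Each engine handles `values_per_cycle` values on every clock cycle,
    so the rate scales linearly with the engine count and the clock.
    """
    return values_per_cycle * engines * clock_mhz


if __name__ == "__main__":
    CLOCK_MHZ = 100  # assumed clock frequency of the user logic
    for engines in (1, 2, 8):
        consumed = rate_mpixels_per_sec(5, engines, CLOCK_MHZ)  # five input bands
        produced = rate_mpixels_per_sec(1, engines, CLOCK_MHZ)  # one output mask
        print(engines, consumed, produced)
```

The linear scaling with the engine count holds only while the memory system can supply five values per engine per cycle, which this sketch does not model.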

Fig. 5 Hardware-to-Hardware Performance: (a) Throughput (M Pixel / Sec), (b) Speedup, for Queensland, GWU 1X, GWU 2X and GWU 8X

Fig. 6 Hardware-to-Software Performance: (a) Execution Time (m sec), (b) Speedup, for Xeon 2.8GHz, 1X, 2X and 8X

Our results have been compared to previous work by Williams et al. [5]. Fig. 5 shows those comparisons. Three hardware implementations have been compared: 1X (one engine is instantiated), 2X (2 engines are instantiated), and 8X (8 engines are instantiated). Our implementations achieve a speedup of up to 16 times compared to those reported in [5]. The superiority of RCs over traditional platforms for cloud detection is demonstrated through the performance plots shown in Fig. 6. Our implementations achieve a speedup of up to 28.3 compared to the 2.8GHz Intel Xeon processor.

V. CONCLUSIONS

This paper presented the design and implementation of an RC-based real-time cloud detection system. We investigated the potential of using RCs for on-board preprocessing by prototyping the Landsat 7 ETM+ ACCA algorithm on one of the state-of-the-art reconfigurable platforms, SRC-6E. Our work has been shown to provide higher performance and higher detection accuracy than previously reported results. The higher performance is achieved through full pipelining and superscaling (up to 8 concurrent engines), thus achieving 4000 Megapixels/sec as a data consumption rate and 800 Megapixels/sec as a data production rate. In addition, the performance has been compared to similar hardware implementations and shown to achieve as much as a 16-fold speedup. The speedup compared to a 2.8GHz Xeon implementation is 28-fold. The detection accuracy has been verified against a software floating-point reference implementation, and the results were identical.

REFERENCES

[1] R.R. Irish, "Landsat 7 Automatic Cloud Cover Assessment," Algorithms for Multispectral, Hyperspectral and Ultraspectral Imagery VI, SPIE, Orlando, FL, 24-26 April 2000, pp. 348-355.
[2] E. El-Araby, T. El-Ghazawi, J. Le Moigne, and K. Gaj, "Wavelet Spectral Dimension Reduction of Hyperspectral Imagery on a Reconfigurable Computer," IEEE International Conference on Field-Programmable Technology, FPT 2004, Brisbane, Australia, December 2004.
[3] SRC-6E C-Programming Environment Guide, SRC Computers, Inc., 2004.
[4] R.S. Basso, J. Le Moigne, S. Veuella, and R.R. Irish, "FPGA Implementation for On-Board Cloud Detection," International Geoscience and Remote Sensing Symposium, Hawaii, 20-24 July 2000.
[5] J.A. Williams, A.S. Dawood, S.J. Visser, "FPGA-based Cloud Detection for Real-Time Onboard Remote Sensing," Proceedings of IEEE International Conference on Field-Programmable Technology (FPT 2002), 16-18 Dec. 2002, pp. 110-116.
[6] J.A. Williams, A.S. Dawood, S.J. Visser, "Real-Time Wildfire and Volcanic Plume Detection from Spaceborne Platforms with Reconfigurable Logic," 11th Australasian Remote Sensing and Photogrammetry Conference, Brisbane, Australia, 2-6 September 2002.