16-Tap, 8-Bit FIR Filter Applications Guide


November 21, 1994
Application Note
By G. Goslin & Bruce Newgard
Version 1.01. © 1995 Xilinx, Inc. All rights reserved.

Summary

This application note describes the functionality and integration of a 16-Tap, 8-Bit Finite Impulse Response (FIR) filter macro with predefined coefficients (e.g. low pass) and a sample rate of 5.44 mega-samples per second or 784 MIPS using an XC4000-4 device. The application note also describes how to set the coefficients of the FIR filter to meet the needs of other applications.

Xilinx Family

XC4000

Demonstrates

- RAM-based Look Up Tables for parallel-to-serial shift registers
- ROM-based Look Up Tables for parallel arithmetic operations
- Distributed arithmetic for high-speed parallel calculations
- MEMGEN and ROM data entry

Table of Contents

Introduction
FIR Filter Coefficients
Design Implementation Guidelines
Converting the 16-Tap into a 15-Tap FIR
Using the Design Files
Acknowledgments

Introduction

Xilinx provides a wide range of programmable logic devices that can be used to develop high-performance DSP products. The XC4000 family of Field Programmable Gate Arrays (FPGAs) offers designers a diverse selection of functionality to build almost any system design. The FPGA can adapt to last-minute design modifications as well as future design iterations without extensive hardware or software modifications, saving both time and money.

The Xilinx RPM functional implementation of the 16-Tap FIR filter shown in Figure 1 is a cost-effective building block for the creative DSP engineer. The design is based on the Xilinx XC4000 family architecture. The XC4000 family provides ample density to build a complete DSP system in a single device. The FIR macro, as implemented, is a low-pass filter. It can be reconfigured as a high-pass or band-pass filter by simply reprogramming the filter coefficients. The high sample rate (e.g. 5.44 million samples per second) and the ability to change many characteristics of the design, including the number of taps, make this FIR filter a versatile macro with a diverse range of DSP applications.

Figure 1. Data Flow Diagram for 16-Tap FIR Filter. (Data input X[7:0] feeds the PSC; the taps are paired 0/15, 1/14, 2/13, 3/12, 4/11, 5/10, 6/9 and 7/8 and applied to coefficients C0 through C7; data output Y[9:0].)

Functional Description

The 16-Tap 8-Bit FIR filter is based on a distributed arithmetic algorithm pioneered by Les Mintzer. The filter consists of seven major components:

a parallel-to-serial converter (PSC_8), a RAM-based shift register (TSB_8), a serial adder (S_ADD4), a ROM-based Look-Up Table (LUT16X8), a complementing register (_COMP), an adder (_ADD9), and a scaling accumulator (SCAL10). For the following discussion, refer to Figure 5 and Figure 6.

The FORMAT input signal indicates whether the incoming data is unsigned binary or 2's-complement. If FORMAT is High, then the data is unsigned binary. An 8-bit data sample is loaded into the parallel-to-serial converter (PSC_8) at the sample rate. The PSC generates a serial output stream that is supplied to the RAM-based shift registers at the bit clock rate. The bit clock rate is determined by

    bit clock rate = (n + 1) x (sample rate)

where (n + 1) represents the number of data bits per sample plus an overflow bit.

The RAM-based shift register (TSB_8) functions as a time-skew buffer. The data bit stored at a particular address of a RAM-based shift register is first read and stored in a flip-flop. A new data bit is then written into that address, replacing the previous data. The data bit held in the flip-flop is then written to the next location. Using RAM-based shift registers rather than a more traditional cascade of data registers significantly reduces the overall size of the FIR filter.

Note the special configuration of the two RAM-based shift registers to control the distribution of data. The upper RAM-based shift register contains TAPS[3:0] and TAPS[15:12], while the lower RAM-based shift register contains TAPS[7:4] and TAPS[11:8]. This configuration is important because it exploits the inherent symmetry of the coefficient values used in a linear-phase FIR filter. The outputs of the registered serial adders are presented to the ROM-based lookup tables. Since the coefficients are symmetric, each sum generated by the adders can be multiplied by a single shared coefficient.
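The tap pairing described above can be checked numerically. The following is a minimal sketch, not the macro's implementation: the function names and the sample data are hypothetical, and the coefficient values are the example values listed with Figure 3, with the ordering C0 through C7 assumed. It shows that adding each mirrored pair of taps first and then multiplying by one shared coefficient gives the same result as a direct 16-tap sum.

```python
# Hypothetical sketch: folded (pre-added) symmetric FIR versus direct form.
# Names, ordering of coefficients, and sample data are assumptions.

def fir_direct(coeffs, taps):
    """Plain sum of products over all taps."""
    return sum(c * x for c, x in zip(coeffs, taps))

def fir_folded(half_coeffs, taps):
    """Add mirrored taps first, then multiply each pair sum by one coefficient."""
    n = len(taps)
    return sum(c * (taps[k] + taps[n - 1 - k]) for k, c in enumerate(half_coeffs))

# Eight unique coefficients (assumed order C0..C7), mirrored to 16 taps.
half = [-2, -4, 5, -5, 3, 3, -18, 78]
full = half + half[::-1]

# Sixteen signed 8-bit samples chosen arbitrarily for the check.
taps = [17, -3, 120, -128, 64, 5, -77, 33, 90, -1, 0, 127, -50, 8, 2, -9]

# Upper group pairs TAPS[3:0] with TAPS[15:12]; lower pairs TAPS[7:4] with TAPS[11:8].
pair_sums = [taps[k] + taps[15 - k] for k in range(8)]

print(fir_direct(full, taps) == fir_folded(half, taps))
```

Each pair sum of two signed 8-bit values needs nine bits, which is consistent with the (n + 1) bit clocks per sample given above.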
Since all possible partial products are pre-computed, the outputs of the serial adders are used to address the ROM-based lookup tables to generate the appropriate multiplication results. The outputs of the registered lookup tables are summed, except for the sign result, which is complemented before being summed. The registered summation is then fed into a scaling accumulator.

The scaling accumulator (SCAL_10) is first loaded with the least significant bit (LSB) result. Each successive result is added to the summation, which has been shifted by one bit to perform a divide-by-two function. This process continues until the most significant bit (MSB), or sign bit, is present. Note that to generate a 2's-complement number, the sign result is complemented and a carry is forced at the adder and at the scaling accumulator. Since two separate sign values were complemented, a two must be added. Once the sign result is accumulated, the final filter result is registered.

The overall timing is controlled by a four-bit counter running at the bit rate. Decoded counter outputs generate the control signals needed to store the filter values in the proper sequence.

FIR Filter Coefficients

The FIR filter coefficients are formatted and stored in ROM-based LUTs. Because the FIR filter coefficients are symmetric, the number of unique coefficients is half the number of taps. Consequently, the 16 taps in this example result in eight unique coefficients.

    3x0 + 5x1 + ... + 5x14 + 3x15

Figure 2. Symmetry of Coefficients.

Since the task of summing the 16 taps is split into two sections, TAPS[3:0] and TAPS[15:12] are added and stored in the upper serial adder. TAPS[7:4] and TAPS[11:8] are added and stored in the lower serial adder. The eight coefficients are correspondingly split into two 16-word x 8-bit ROM-based LUTs formatted as shown in Figure 3.

    Address  Data                 Address  Data
    0000     0                    1000     C3
    0001     C0                   1001     C3 + C0
    0010     C1                   1010     C3 + C1
    0011     C1 + C0              1011     C3 + C1 + C0
    0100     C2                   1100     C3 + C2
    0101     C2 + C0              1101     C3 + C2 + C0
    0110     C2 + C1              1110     C3 + C2 + C1
    0111     C2 + C1 + C0         1111     C3 + C2 + C1 + C0

    (e.g. C0A = -2, C1A = -4, C2A = 5, C3A = -5;
          C0B = 3, C1B = 3, C2B = -18, C3B = 78)

Figure 3. ROM-Based Coefficient LUT
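The table layout of Figure 3 and the bit-serial accumulation described earlier can be sketched together in a few lines. This is an illustration under stated assumptions, not the macro's logic: the function names are hypothetical, address bit i is assumed to select coefficient Ci, and the sketch shifts each lookup result left by its bit weight instead of shifting the accumulator right as the hardware scaling accumulator does. That substitution keeps the arithmetic in exact integers while producing the same sum of products.

```python
# Hypothetical sketch of a 4-input distributed-arithmetic lookup.
# Assumption: address bit i selects coefficient c[i], as in Figure 3.

def build_lut(c):
    """Entry a holds the sum of c[i] for every set bit i of address a."""
    return [sum(c[i] for i in range(4) if (a >> i) & 1) for a in range(16)]

def to_hex8(v):
    """8-bit two's-complement value as two upper-case hex digits."""
    return format(v & 0xFF, "02X")

def da_group(lut, pair_sums, nbits=9):
    """Bit-serial sum of products for four 9-bit two's-complement inputs."""
    acc = 0
    for j in range(nbits):                     # LSB first, sign bit last
        addr = 0
        for i, s in enumerate(pair_sums):
            addr |= ((s >> j) & 1) << i        # bit j of input i -> address bit i
        term = lut[addr] << j                  # weight the lookup by 2**j
        acc += -term if j == nbits - 1 else term   # sign bit carries negative weight
    return acc

coeffs_a = [-2, -4, 5, -5]                     # example C0A..C3A from Figure 3
lut_a = build_lut(coeffs_a)
rows = [",".join(to_hex8(v) for v in lut_a[i:i + 4]) for i in range(0, 16, 4)]

pair_sums = [100, -256, 254, -1]               # arbitrary 9-bit test inputs
direct = sum(c * s for c, s in zip(coeffs_a, pair_sums))
print(rows)
print(da_group(lut_a, pair_sums) == direct)
```

With the example coefficients, the four rows come out as 00,FE,FC,FA / 05,03,01,FF / FB,F9,F7,F5 / 00,FE,FC,FA, which are the data values listed in Figure 4.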

All possible partial products are pre-computed and placed in a memory definition file, or .mem file, as shown in Figure 4 (refer to the MemGen Program chapter in the XACT Reference Guide, Volume 1, April 1994, pages I-59 through I-67).

    ;DA_LUTA.MEM: A 16-word deep, 8-bit wide ROM
    TYPE ROM       ;The memory is a ROM
    DEPTH 16       ;The memory is 16-words deep
    WIDTH 8        ;The memory word is 8-bits wide
    SYMBOL NONE
    DEFAULT 0      ;Default value for unspecified locations
    DATA           ;User defined coef. ROM data starts here
    00,FE,FC,FA    ; 0, C0, C1, (C0+C1)
    05,03,01,FF    ; C2, (C0+C2), (C1+C2), (C0+C1+C2)
    FB,F9,F7,F5    ; C3, (C0+C3), (C1+C3), (C0+C1+C3)
    00,FE,FC,FA    ; (C2+C3), (C0+C2+C3), (C1+C2+C3), (C0+C1+C2+C3)
    ;END OF FILE

Figure 4. Memory Definition File

Note: After the .mem files are processed by the MEMGEN memory compiler, the RLOCs must be manually changed to match the RLOCs in the original design files.

Design Implementation Guidelines

The bit clock must use a BUFGS primitive, not a BUFGP. This is because the bit clock is also used to control the write-enable signals in the RAM-based shift registers (TSB_8). A BUFGS connects to the write control signals on a Configurable Logic Block (CLB), while a BUFGP primitive does not.

The maximum bit clock rate is 49 MHz using an XC4000-4 device. The 49 MHz clock rate corresponds to 5.44 million samples per second because it takes nine bit-rate clock periods to add two eight-bit numbers with overflow. The critical path is in the carry chain of the final scaling accumulator. The RAM-based shift registers require a minimum clock width of 9.9 ns for both High and Low. Pipelining the arithmetic functions in the design increases the overall performance but also increases the logic resource requirements.

Since the 16-Tap 8-Bit FIR filter design is a Relationally Placed Macro (RPM), it must be placed so that it does not conflict with other design modules.
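The clock-rate figures quoted in these guidelines can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in this note; the variable names are hypothetical. Two readings in it are assumptions rather than statements from the note: that the 784 figure in the Summary corresponds to 16 taps times the 49 MHz bit clock, and that the 9.9 ns High and Low minimum widths together set a 19.8 ns minimum period for the RAM-based shift registers.

```python
# Hypothetical cross-check of the timing figures quoted in this note.

DATA_BITS = 8           # bits per input sample
BIT_CLOCK_MHZ = 49      # maximum bit clock on an XC4000-4 device
TAPS = 16
MIN_PULSE_NS = 9.9      # minimum clock High and Low width for TSB_8

# bit clock rate = (n + 1) x (sample rate), so sample rate = bit clock / (n + 1)
clocks_per_sample = DATA_BITS + 1
sample_rate_msps = BIT_CLOCK_MHZ / clocks_per_sample

# Assumed reading of the Summary figure: one tap operation per tap per bit clock.
taps_times_clock = TAPS * BIT_CLOCK_MHZ

# Assumed reading: High width plus Low width gives the shortest usable period.
ram_limit_mhz = 1000 / (2 * MIN_PULSE_NS)

print(round(sample_rate_msps, 2))   # 5.44
print(taps_times_clock)             # 784
print(ram_limit_mhz > BIT_CLOCK_MHZ)
```

Under these assumptions the RAM pulse-width limit sits slightly above 49 MHz, which agrees with the statement that the critical path is the carry chain of the scaling accumulator rather than the shift registers.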
The shape of the RPM is a 9 CLB row by 10 CLB column rectangular block, and it occupies 67 CLBs.

Converting the 16-Tap into a 15-Tap FIR

The 16-Tap FIR filter design can be changed into a 15-Tap FIR filter with two simple modifications. First, the center two taps of the 16-Tap FIR must register the same data. This is done by schematically changing the 8th Tap data source from TAP7 to TAP6 (e.g. change the DI_4 input of the lower TSB_8 symbol from TAP7 to TAP6). Second, since the 7th Coefficient corresponds to a single tap in a 15-Tap FIR, and the data registered at the 7th and 8th Taps are equal, the sum of the two taps must be divided by two. This is done by dividing the 7th Coefficient by 2 (e.g. C3 in the lower ROM-based lookup table, LUT_B, must be changed to C3/2).

Using the Design Files

This design is available from the Xilinx BBS (see Xilinx Technical Bulletin Board in Section 6 of the Xilinx Data Book). This section describes the software required to run the design and the steps involved. Also, please read through the Limitations and Restrictions section.

Software Requirements

The following software is required to process this design:

- VIEWdraw or VIEWdraw-LCA schematic editor. This software is required in order to make modifications to the schematics.
- Xilinx XACT 5.0 FPGA development system, including the PPR place and route program and the X-BLOX module generator.

Using the Design on Your System

1. Create a new directory on your hard disk called C:\FILTER.
2. Copy all of the files and sub-directories from the DSP\DESIGNS\ directory on the Programmable Logic Breakthrough 95 CD-ROM to C:\FILTER.
3. Make sure that the VIEWlogic design library pointers in the viewdraw.ini file are set appropriately for your directory structure.
4. Invoke XDM.
5. Set the part type to XC4003PC84-4.
6. Run XMAKE on FIR_LCA.1 to process the design. The schematic file named FIR_LCA.1 is the top-level schematic.

Limitations and Restrictions

WARNING: Xilinx, Inc. does not make any representation or warranty regarding this design or any item based on this design. Xilinx disclaims all express and implied warranties, including but not limited to the implied fitness of this design for a particular purpose and freedom from infringement. Without limiting the generality of the foregoing, Xilinx does not make any warranty of any kind that any item developed based on this design, or any portion of it, will not infringe any copyright, patent, trade secret or other intellectual property right of any person or entity in any country. It is the responsibility of the user to seek licenses for such intellectual property rights where applicable. Xilinx shall not be liable for any damages arising out of or in connection with the use of the design, including liability for lost profit, business interruption, or any other damages whatsoever.

Design Support and Feedback

This application note may undergo future revisions and additions. If you would like to be updated with new versions of this application note, or if you have questions, comments, or suggestions, please send an E-mail to dsp@xilinx.com or a FAX addressed to Corporate Applications FIR Filter Macro at (408) 879-4442.

IMPORTANT: Please be sure to include which version of the application note you are using. The version number is in the lower right-hand corner of page 1.

Acknowledgments

Special thanks to Les Mintzer, research and development consultant and U.C. Irvine professor, for creating the enabling technology described herein.

Figure 5. Data Flow Block Diagram for 16-Tap FIR Filter RPM.

Figure 6. 16-Tap 8-Bit FIR Filter RPM High-Level Schematic.

Figure 7. 16-tap, 8-bit FIR low-pass filter characteristics.