Interconnection Network of OTA-based FPAA



Similar documents
System Interconnect Architectures. Goals and Analysis. Network Properties and Routing. Terminology - 2. Terminology - 1

Chapter 2. Multiprocessors Interconnection Networks

Architectural Level Power Consumption of Network on Chip. Presenter: YUAN Zheng

Lecture 23: Interconnection Networks. Topics: communication latency, centralized and decentralized switches (Appendix E)

Interconnection Networks. Interconnection Networks. Interconnection networks are used everywhere!

ISSCC 2003 / SESSION 13 / 40Gb/s COMMUNICATION ICS / PAPER 13.7

Scaling 10Gb/s Clustering at Wire-Speed

Interconnection Network Design

Interconnection Networks Programmierung Paralleler und Verteilter Systeme (PPV)

Module 5. Broadcast Communication Networks. Version 2 CSE IIT, Kharagpur

On-Chip Interconnection Networks Low-Power Interconnect

Components: Interconnect Page 1 of 18

Scalability and Classifications

DIGITAL-TO-ANALOGUE AND ANALOGUE-TO-DIGITAL CONVERSION

PLAS: Analog memory ASIC Conceptual design & development status

Module 2. Embedded Processors and Memory. Version 2 EE IIT, Kharagpur 1

Digital to Analog Converter. Raghu Tumati

ISSCC 2003 / SESSION 10 / HIGH SPEED BUILDING BLOCKS / PAPER 10.5

Lecture 18: Interconnection Networks. CMU : Parallel Computer Architecture and Programming (Spring 2012)

Interconnection Networks

DigiPoints Volume 1. Student Workbook. Module 4 Bandwidth Management

Using Pre-Emphasis and Equalization with Stratix GX

Topological Properties

Local-Area Network -LAN

W a d i a D i g i t a l

Chapter 12: Multiprocessor Architectures. Lesson 04: Interconnect Networks

Elettronica dei Sistemi Digitali Costantino Giaconia SERIAL I/O COMMON PROTOCOLS

Data Center Switch Fabric Competitive Analysis

1. Memory technology & Hierarchy

Memory Systems. Static Random Access Memory (SRAM) Cell

Juniper Networks QFabric: Scaling for the Modern Data Center

Process Control and Automation using Modbus Protocol

PowerPC Microprocessor Clock Modes

1.1 Silicon on Insulator a brief Introduction

Chapter 2 Data Storage

Buffer Op Amp to ADC Circuit Collection

Computer Network. Interconnected collection of autonomous computers that are able to exchange information

Impedance 50 (75 connectors via adapters)

Chapter 2 Heterogeneous Multicore Architecture

CMOS, the Ideal Logic Family

Lecture N -1- PHYS Microcontrollers

LEVERAGING FPGA AND CPLD DIGITAL LOGIC TO IMPLEMENT ANALOG TO DIGITAL CONVERTERS

MEASUREMENT UNCERTAINTY IN VECTOR NETWORK ANALYZER

Lecture 2 Parallel Programming Platforms

Interconnection Network

What to do about Fiber Skew?

SPADIC: CBM TRD Readout ASIC

ZL40221 Precision 2:6 LVDS Fanout Buffer with Glitchfree Input Reference Switching and On-Chip Input Termination Data Sheet

SAN Conceptual and Design Basics

ε: Voltage output of Signal Generator (also called the Source voltage or Applied

Computer Networks. Definition of LAN. Connection of Network. Key Points of LAN. Lecture 06 Connecting Networks

Lizy Kurian John Electrical and Computer Engineering Department, The University of Texas as Austin

Gates, Circuits, and Boolean Algebra

Infrastructure Components: Hub & Repeater. Network Infrastructure. Switch: Realization. Infrastructure Components: Switch

COMP 422, Lecture 3: Physical Organization & Communication Costs in Parallel Machines (Sections 2.4 & 2.5 of textbook)

The UC Berkeley-LBL HIPPI Networking Environment

CS 78 Computer Networks. Internet Protocol (IP) our focus. The Network Layer. Interplay between routing and forwarding

Introduction to Exploration and Optimization of Multiprocessor Embedded Architectures based on Networks On-Chip

Fault Modeling. Why model faults? Some real defects in VLSI and PCB Common fault models Stuck-at faults. Transistor faults Summary

T1 Networking Made Easy

AN460 Using the P82B96 for bus interface

Maximizing Server Storage Performance with PCI Express and Serial Attached SCSI. Article for InfoStor November 2003 Paul Griffith Adaptec, Inc.

Architectures and Platforms

SCSI vs. Fibre Channel White Paper

Read this before starting!

Cancellation of Load-Regulation in Low Drop-Out Regulators

Gates. J. Robert Jump Department of Electrical And Computer Engineering Rice University Houston, TX 77251

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 7, NO. 3, SEPTEMBER

PCI Express Overview. And, by the way, they need to do it in less time.

Fundamentals of Signature Analysis

PLL frequency synthesizer

SECTION 2 TECHNICAL DESCRIPTION OF BPL SYSTEMS

Naveen Muralimanohar Rajeev Balasubramonian Norman P Jouppi

InfiniBand Clustering

Industrial Ethernet How to Keep Your Network Up and Running A Beginner s Guide to Redundancy Standards

A New Chapter for System Designs Using NAND Flash Memory

Course 12 Synchronous transmission multiplexing systems used in digital telephone networks

Non-blocking Switching in the Cloud Computing Era

Computer Systems Structure Main Memory Organization

PROGRAMMABLE ANALOG INTEGRATED CIRCUIT FOR USE IN REMOTELY OPERATED LABORATORIES

Principles and characteristics of distributed systems and environments

A Short Discussion on Summing Busses and Summing Amplifiers By Fred Forssell Copyright 2001, by Forssell Technologies All Rights Reserved

Analog vs. Digital Transmission

Computer Systems Structure Input/Output

Use and Application of Output Limiting Amplifiers (HFA1115, HFA1130, HFA1135)

Comparison of Network Coding and Non-Network Coding Schemes for Multi-hop Wireless Networks

A Novel Low Power Fault Tolerant Full Adder for Deep Submicron Technology

COMPUTER HARDWARE. Input- Output and Communication Memory Systems

Analysis and Design of High gain Low Power Fully Differential Gain- Boosted Folded-Cascode Op-amp with Settling time optimization

Outline. Introduction. Multiprocessor Systems on Chip. A MPSoC Example: Nexperia DVP. A New Paradigm: Network on Chip

The ABCs of Spanning Tree Protocol

Public Switched Telephone System

Interfacing Analog to Digital Data Converters

A Multichannel Storage Capacitor System for Processing of Fast Simultaneous Signals. Atsushi Ogata. IPPJ-T-16 Oct. 1973

Extending the Internet of Things to IPv6 with Software Defined Networking

Datagram-based network layer: forwarding; routing. Additional function of VCbased network layer: call setup.

Current vs. Voltage Feedback Amplifiers

8 Gbps CMOS interface for parallel fiber-optic interconnects

Lecture 24. Inductance and Switching Power Supplies (how your solar charger voltage converter works)

Transcription:

Chapter S Interconnection Network of OTA-based FPAA 5.1 Introduction Aside from CAB components, a number of different interconnect structures have been proposed for FPAAs. The choice of an intercmmcclion archilt:cture and its implementation will influence the routability of prolotyped circuits and their performance. Analog circuits are far more sensitive than digital circuits to problems of fan-out, noise, and the presence of switches in the signal path [ 41]. 5.2 Design Goals for Interconnect Network a) Provide adequate flexible The network must be capable of implementing the interconm:clion topology required by the programmed logic network with acceptable intcrconncct delays [431. b) Use configuration memory efficiently Space required for configuration memory can accounl for a reasonable fraction of the array real-estate. Configuration bits should be used efficiently. c) Buluncc bisection bandwidth Interco1U1ection-wiring takes space and in some topologies, dominates the array size. The wiring topology should be chosen to balance interconnect bandwidth with array size so that neither strictly dominates the other. 57

d) Minimi:ie delnys The delay through the routing network can easily be the dominate delay in a programmable technology. Care is required to minimize interconm:ct delays. Two significant factors of delay are, i) Propagation and fan-out delay - Interconnect delay on a wire is proportional to distance and capacitive loading (fan-out). This makes interconnect runs roughly proportional 10 distance run, especially when there are regular taps into the signal run. Consequently, small/short signal runs arc faster than long signal runs. ii) Switched element delay - Each programmable switching element in a path (crossbar, multiplexor etc.) adds delay. This delay is generally much larger than the propagation or fan-out delay associated with covering the same physical distance. Consequently, one generally wants to minimize the number of switch elements in a path, even if this means using some longer signal runs. Switching can be used to reduce fan-0ut on a line by segmenting tracks, and large fan-out can be used to reduce switching by making a signal always available in several places. Minimizing the interconnect delay, therefore, always requires technology dependent tradeoffs between the amount of switching and the length of wire runs [43). 5.3 Different Types of Interconnection Network 5.3.1 Local Interconnection At the base, of the interconnect hierarchy; there are local connections among array elements (Fig.5.1 ). These are generally worth keeping in any scheme because short, local interconnect can be fast and efficient (small fan-out and propagation delay) local 58

interconnect resources are relatively cheap. ~11b regions of logic networks arc uficn heavily interconnected. How far locul interconnect spans is. of course. highly dependent on the technology costs (e.g. how expensive is switching delay versus propagation dolny? how much :;pncc do configuration bit:> requin:? how big are wires relative to fixed logic and memory?) and the detailed choice of a local interconnect scheme [1 5,45]. ' I!f: I C:\H "' C":\H,._.,. -j_ ~ - C..\Il ~..., <"AB,... ~ ~.. ~ I Figure 5.1 Local interconnection scheme - '"' 1.\n.. '"<'AH.. ~... "'.. 1'.\8 "'.. <'. \B.... " Figure 5.2 Global interconnection scheme Quan et al. [51] proposed the use of local interconnects. In their architecture, each CAB can be connected to its eight neighbours and itself. This would seem to be a severe limitation on the flexibility of this FPAA; however, they focus on the large number of analog circuits with mostly local interconnections (51 ]. Pierzchala et al. tried an even 59

more limiting architecture in which no electronic switches were included in Lhe signal paths. While these designs may provide benefits in bandwidth and signal to noise ratio (SNR), they lack the flexi bility and generality needed in n truly general purpose FPJ\A. Also one more FPAA having local interconnection nct\vork have bet:n reported from rechnical University of Gdansk, Poland. which utilized this structure very wc11 by using a highly linear OT A ns main computational element that eliminates the rcquimm:nt of flexible interconnect structure [I 5]. 5.3.2 Global Interconnection ln another design Fig. 5.2, nn interconnect scheme with both local and global signal paths are introduced [ IS]. This configuration provided local routing paths for a cell's four neighbors (north, south, east and west), as well as connections to global busscs that run horizontally, vertically, and diagonally. This two tiered hierarchy incrcasi.;s the routing t1exibility within the FPAA [ 15). 5.3.3 Crossbar Interconnection A crossbar int.:rconncc1 network allo"\:> :.imultaneous connections from any input port to any output port. Fig. 5.3 shows a conceptual view and Fig. 5.4 shows the interconnection scheme of such a network. One possible implementation of the crossbar network is to assign a global bus to each input, and each of these busses can be connected to ony outputs through o switch. Observe thnt s uch a network requires only one switching stage that is, every input and output pair is connected through a single switching element. This architecture provides full connection flexibility, but suffers from a large area 60

overhead {wa.:iting wile resources) and a high-energy consumption (due to the long global buses and the large number ofswitches)[52]. N inputs - I! c LI : t!t-l-fta tlttj rn-. f '.\H ' I <" \D i ii t j i 1 L - I FFH A,-- CA I~ Figure 5.3 Conceptual crossbar network Figure 5.4 Cross-biar interconnection scheme 5.3.4 Fat-Tree Interconnect Network The "fat tree" interconnect network, shown in Fig. 5.5 und.t.. ig.5.6, is a network based on a complete (binary) tree. Logic blocks are located at the leaves of the network tree. The number ofbuses per channel increases as going up the tree [52]. Lee and Gulak [44] tried to solve this problem by using an area-universal fat tree network. They used a hierarchical fat tr~1: network of small crossbar switches where the CABs were connected as the leaves of the tree. In an additional effort to minimize the size required by the switch networks, the number of connections wa.c: M nstraincd (Lee and Gulak). Unfortunately, their prototype was Loo small to really test this interconnect concept. 61

Figure 5.5 Conceptual fat-tree interconnect <. \B.. (.\ II ('. \B ':;: t: n ~\, '/) s,... t.\h C:\H,~ L \H f t ('. \B I ' n '/'l ~ :sw : ~ ~ ' ( ".\ B Figure 5.6 Fat-tree interconnection scheme 5.4 Interconnection Network of OTA-based FPAA The structure of FPAA shown in Fig. 5. 7 [ 14) consists of 40 CABs positioned in eight columns and five rows. Additionally three OTAs ol-o3 acts as signal buffers. Input signals are delivered through lines, i 1 and i3. Because of this, up to three different filters can be realized simultaneously. The transconductance parameters of all the OTAs are controlled by external voltage and through digital switching of the output current mirrors. While voltage is common for all the amplifiers in the array, it is still possible to set the transconductance of every OTA separatdy by sening the gain of the OTAs current 62

mirror. The structure of the FPAA is composed of universal configurnble nnalos blocks (CABs). CAO shown in Fig. 5.8 consists of one programmable fully differential OTA, Figure 5.7 Structure of FPAA one programmable capacitor CEQ and a set of switch S 1- S 12. The switches are placed in such a way that the OT A can be connected with or without the capacitor. It is also possible to pass the signals of other CABs through the switches situated nt the top tmd the bottom o f the CAB. Switches S I and S2 enable to connect the input lo the OTA directly or in an inverse way. The polarity of OT A~ input signal could be straight or inverse, depending on the settings of switches SI and S 12, respectively [ 14 ). ~-... -----~ :n / ll..., ~<(/ l 11-----.-.. > fr l - or ' - -~ r F: u. - s,- s;.,.. -<$,-- -~ :-:- (,.,, r l-., ; ~-. ---- - l~---.. Figure 5.8 Structure of CAB -::1 63

5.5 Proposed OTA-based FPAA Implementation There ore different methods for arranging FP AA architectures constituting CABs and their routing network are studied. mainly based 011 'array' structure. Since nrca and efficiency is our concern a m:w routing architecture that reduces redundancy of the architeclure was developed, i.e. reducing the number of unused CABs while implementing an analog processing function. So instead of' array' architecture which is a conu11on :structure for f PAA implementation, a method of 'CAB clustering' in which FP AA is divided into a number of clusters und each cluster i:; having four fully differential programmable OTAs as CABs and differential switches was implemented, as shown in Fig. 5.9 and the versatile cluster consisting of four CABS are shown in Fig. 5.10. In comparison to the FPAA introduced by B. Pankiewicz el nl r 14). o ur Fl'AA uses less number of switches with the same functionality. In e<ich cluster shown in Fig. 5.9, CAB I and CAB3 contain programmable fully differential OT A, while CAB2 and CAB4 are programmable fully differential OT As with one programmable capacitor for each CAB. The switches are also differential switches and placed in such a way that the OT A can be connected with or without the capacitor. It is also possible to pass the signals of other CABs through the switches situated at the top and the bottom of the CAB. One input and one output terminal is available from each cluster. 64

i i i2 i3 Cluster A.3 Cluster 83 Cluster C3 Cluster 03 Cluster D Cluster f j Cluster C3 Figure 5.9 New architecture of C ABs and routing network in the proposed FPAA The Clusters and their arrangement are designed mainly for continuous time filter applications. Heterogeneous CABs used for realizing the different basic building blocks like grounded resistor. grounded inductors and noating inductors. The two PCA are used as grounded capacitor as well as floating capacitor. 6 o incll OtC$ S111tcll Figure 5.10 Structure of one cluster 65

5.6 Proposed FPAA using Clusters The FPAA used for implementation of filters is presented in Fig. 5.9. It consists of 21 clusters and each cluster is having 4 CABs and 41 switches. Clusters are positioned in seven columns and three rows. Additionally three OT As o 1 and o3 - act as signal buffers. Input signal:s are delivered through lines i l,i2 and i3. Because of this, up to three different filters can be realized simultaneously. The transconductum.:e parameters of all the OTAs arc controlled by external voltage control voltage, called "Vciri". and through current mirrors. While voltage "Vciri", is common for all the anlplificrs in the array, it is still possible lo set the transconductance of every OTA separately by setting the gain of the OTAs current mirror. 66