An Ultra-low energy asynchronous processor for Wireless Sensor Networks




L. Necchi, L. Lavagno, D. Pandini, L. Vanzago
Politecnico di Torino, STMicroelectronics

Wireless Sensor Networks
- Ad-hoc wireless networks
- Sensing
- Computation
- Actuation
Application areas:
- Monitoring
- Building automation
- Health care, medical
- Emergency response
- Automotive
Async 06 - March 13-15

Key WSN Requirements
- Flexibility (general-purpose design)
- High energy efficiency (battery powered)
- Extremely wide voltage supply range
  - Exhausted battery or energy scavenging
- Fast and inexpensive wake-up
  - Event-driven power management (not predictable)
- Sporadic high computational load
  - Encryption (security)
  - Aggregation, distributed data processing

Sensor node architecture
Main components of a WSN node:
- Microcontroller (Atmel AVR, TI MSP430)
- Memory
- Radio
- Sensors / Actuators
- Power supply
  - Battery (energy storage)
  - Power scavenging

Circuit-level Power Management
[Table: Clock Gating, Power Gating, Dynamic Voltage Scaling (DVS) and Adaptive Body Biasing, compared by whether they save energy while idle or while active, and by scenario (idle time, long deadlines)]
DVS can be obtained by:
- Off-line pre-computed voltage/frequency tables
  - High delay margins
- Evaluated on-line: PowerWise, Razor, Asynchronous, De-synchronization
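As a rough illustration of the off-line table approach, the sketch below picks the lowest supply voltage whose tabulated throughput meets a requirement. The two end points (48 MIPS at 0.51 V, 170 MIPS at 1.2 V) are the figures reported in the conclusions of this talk; the middle entry, the table name and the function name are hypothetical and only serve the example.

```python
from bisect import bisect_left

# (maximum throughput in MIPS, supply voltage in V), sorted by throughput.
# The first and last points are quoted in the talk's conclusions;
# the 100 MIPS / 0.80 V entry is an invented placeholder.
VF_TABLE = [(48, 0.51), (100, 0.80), (170, 1.20)]


def select_voltage(required_mips):
    """Return the lowest tabulated voltage that sustains required_mips."""
    speeds = [mips for mips, _ in VF_TABLE]
    idx = bisect_left(speeds, required_mips)
    if idx == len(VF_TABLE):
        raise ValueError("required throughput exceeds the fastest operating point")
    return VF_TABLE[idx][1]


if __name__ == "__main__":
    for load in (10, 48, 60, 170):
        print(load, "MIPS ->", select_voltage(load), "V")
```

A table like this has to be built with pessimistic margins, since it cannot react to the actual process corner or temperature; that is the drawback the on-line techniques try to remove.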

Closed-loop DVS techniques
- PowerWise: samples the output of a digital delay line with a high-frequency clock, and adjusts the voltage supply to deliver the required performance
- Razor: detects timing errors by comparing the values stored in duplicated slave latches, the second of which is clocked half a clock cycle later; restarts the pipeline and adjusts the voltage supply accordingly
- Asynchronous with Dual-Rail encoding: (quasi) delay-insensitive implementation that guarantees correctness for (almost) every voltage supply and process variation
- Asynchronous with Bundled Data encoding: a digital delay line output is used directly to generate a local clock signal, resulting in a direct dependence between voltage supply and delay period

De-synchronization
[Figure: a synchronous circuit, clocked by CK, is desynchronized into an asynchronous circuit]

Design Flow
Obtain asynchronous implementation from synchronous specification:
HDL (RTL) -> Synthesis & Optimization (with Library) -> Netlist -> De-synchronization -> Netlist -> Physical Design -> Layout
- Think synchronously
- Design synchronously
- De-synchronize (automatically)
- Test synchronously
- Run asynchronously

Synchronous circuit
[Figure: master-slave (MS) flip-flops driven by a common clock CK]

De-synchronization
[Figure: the same circuit with the clock replaced by local controllers labeled C]

De-synchronization
- Distributed micropipeline-style controllers substitute the clock network
- The data path remains intact!
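The controllers in the figures are drawn as boxes labeled C. Assuming these denote Muller C-elements, the usual building block of micropipeline-style handshake controllers, their behavior can be sketched as follows: the output follows the inputs when they agree and holds its previous value otherwise. The class name and interface are illustrative only, not taken from the talk.

```python
class CElement:
    """Behavioral model of a two-input Muller C-element."""

    def __init__(self, initial=0):
        self.out = initial

    def update(self, a, b):
        # Both inputs equal: the output takes their common value.
        # Inputs differ: the output keeps its previous state.
        if a == b:
            self.out = a
        return self.out


if __name__ == "__main__":
    c = CElement()
    for a, b in [(1, 0), (1, 1), (0, 1), (0, 0)]:
        print(f"a={a} b={b} -> out={c.update(a, b)}")
```

The hold behavior is what lets a controller wait until both its predecessor and successor are ready before it opens or closes a latch.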

Flow equivalence [Le Guernic, Talpin, Le Lann, 2003]
Synchronous behavior:
A: 1 3 0 2 1 5 3 1 6 0
B: 5 1 2 3 1 4 2 4 3 1
De-synchronized behavior:
A: 1 3 0 2 1 5 3 1 6 0
B: 5 1 2 3 1 4 2 4 3 1
Theorem: The de-synchronization model preserves flow-equivalence
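Read informally, flow equivalence says that every storage element sees the same sequence of values in both circuits, even though the instants at which the values appear differ. The sketch below checks that property on timestamped traces. The value sequences for A and B are the ones on the slide; the timestamps, the trace format and the function names are assumptions made for the example.

```python
A_VALUES = [1, 3, 0, 2, 1, 5, 3, 1, 6, 0]
B_VALUES = [5, 1, 2, 3, 1, 4, 2, 4, 3, 1]


def value_flow(trace):
    """Drop the timestamps of (time, value) events, keep the value order."""
    return [value for _, value in sorted(trace, key=lambda event: event[0])]


def flow_equivalent(behavior_x, behavior_y):
    """True if both behaviors store the same value sequence in every latch."""
    if behavior_x.keys() != behavior_y.keys():
        return False
    return all(
        value_flow(behavior_x[name]) == value_flow(behavior_y[name])
        for name in behavior_x
    )


# Synchronous run: A and B both change on every clock tick.
sync = {
    "A": [(tick, v) for tick, v in enumerate(A_VALUES)],
    "B": [(tick, v) for tick, v in enumerate(B_VALUES)],
}
# De-synchronized run: hypothetical, unaligned event times per latch.
desync = {
    "A": [(0.7 * i, v) for i, v in enumerate(A_VALUES)],
    "B": [(0.2 + 0.9 * i, v) for i, v in enumerate(B_VALUES)],
}

if __name__ == "__main__":
    print(flow_equivalent(sync, desync))
```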

De-synchronization Benefits
For the end user:
- Reduced electromagnetic emission
- Process variation tolerance
  - Enables partial average-case design, wrt process & environment variation (not wrt data-dependent delay)
The resulting circuit will be:
- Ready for frequency and voltage scaling
- Inherently more robust to delay variations
- Virtually free of performance or area overhead wrt synchronous
For the designer:
- Conventional EDA tools and design flow
- Limited design time and effort, fully automated
- Re-use legacy designs

Asynchronous advantages not offered by de-synchronization
- Fine-grained power management
  - The desynchronized circuit inherits the synchronous clock gating
- Fine-grained pipelining
  - The pipeline structure is not changed
- Data-dependent delays
  - Could be exploited by using a datapath with completion detection (work in progress)
- Robustness with respect to uncorrelated local variability
  - Would require completion detection

Synchronous Logic Interfacing: FAST LOGIC
[Figure labels: Data path (not modified), Handshaking line]

Synchronous Logic Interfacing: SLOW LOGIC
- Synchronized with an external slower clock (External CK)
- Just low EMI

Synchronous Logic Interfacing: SELF TIMED LOGIC
- Example: SRAM with Completion Detection

Sensor node architecture
Main components of a WSN node:
- Microcontroller: Atmel AVR
- Memory
- Radio
- Sensors / Actuators
- Power supply
  - Battery (energy storage)
  - Power scavenging

Our Case Study
- Application-independent 8-bit CPU architecture:
  - Atmel AVR instruction set (like MICA2 - MICAZ) from OpenCores.org, implemented with a 130nm technology
  - Toolchain and lots of software are ready to use: nesC, TinyOS, TinyDB, Surge, TOSSIM
- Aggressive energy management enabled by de-synchronization, using:
  - Dynamic Voltage Scaling
  - Zero wake-up time (no CK, no wait for PLL to restart)

Typical AVR architecture
[Figure: INSTR. Memory and DATA Memory; stages Instruction FETCH, Instruction DECODE, MEM Access, ALU Execution; Data Path (8 bit); Address bus; External CK with Clk distribution]

Design Choices
- Main target is energy efficiency (vs speed)
- Large delay margins (100%) to increase robustness at low voltage supply
- AVR core is really small (~4500 gates), hence we used a single controller
  - Reduced area overhead
  - No electromagnetic emission reduction
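To make the margin figure concrete, the sketch below sizes a matched delay line, assuming that a 100% margin means the delay line is twice as slow as the worst-case logic path it tracks. The critical-path value and per-cell delay in the usage lines are invented numbers, and both function names are hypothetical.

```python
import math


def matched_delay_ns(critical_path_ns, margin=1.0):
    """Delay the matched line must provide: logic delay plus a relative margin."""
    if critical_path_ns <= 0 or margin < 0:
        raise ValueError("critical path must be positive and margin non-negative")
    return critical_path_ns * (1.0 + margin)


def cells_needed(target_delay_ns, cell_delay_ns):
    """Number of identical delay cells needed to reach at least the target delay."""
    if cell_delay_ns <= 0:
        raise ValueError("cell delay must be positive")
    return math.ceil(target_delay_ns / cell_delay_ns)


if __name__ == "__main__":
    target = matched_delay_ns(4.0)          # hypothetical 4 ns critical path
    print(target, "ns,", cells_needed(target, 0.5), "cells of 0.5 ns")
```

Because the delay line and the logic are built from the same gates and share the supply, both slow down together as the voltage drops; the margin only has to cover how differently they scale.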

De-synchronized AVR
[Figure: the same AVR architecture with the clock replaced by a single controller (C), a delay chain, and handshake signal distribution; data path and address bus unchanged]

Logic and Delay Line Matching

Energy Efficiency
[Plots: Energy per Instruction, Power Consumption, Leakage per instruction and Logic Delay versus Voltage Supply [V]]

Some Past Work Comparison
- Philips 80c51 (H. van Gageldonk, 1998): asynchronous bundled-data implementation of the 8051 ISA, general purpose.
- Lutonium (A. Martin et al., 2003): asynchronous QDI implementation of the 8051 ISA.
- SNAP/LE (V. Ekanayake et al., 2004): asynchronous QDI processor specifically designed for WSN.
- Razor (D. Ernst et al., 2004): synchronous processor that estimates the best Vdd by dynamically monitoring the delay of the logic using a redundant latching scheme.

CONCLUSIONS
- Aggressive energy management using DVS
  - 14 pJ/instr @ 1.2 V (170 MIPS)
  - 2.7 pJ/instr @ 0.51 V (48 MIPS)
- Minimal overhead wrt synchronous counterpart
  - +6% area (due to FF->latch conversion)
  - -20% speed (could be improved by reducing margins)
- Future work:
  - Analysis with other SPICE-like simulators (Hsim)
  - Statistical simulations to check robustness wrt process variability (Monte Carlo)
  - Fabrication (?)
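The two operating points can be turned into average power with a one-line unit conversion: energy per instruction times instructions per second. The sketch below does that arithmetic, assuming the core runs continuously at the quoted throughput; the resulting power figures are derived here and are not stated in the talk.

```python
def average_power_mw(energy_pj_per_instr, mips):
    """Average power in mW for a core executing continuously at `mips`."""
    joules_per_instr = energy_pj_per_instr * 1e-12
    instr_per_second = mips * 1e6
    return joules_per_instr * instr_per_second * 1e3  # W -> mW


if __name__ == "__main__":
    high = average_power_mw(14.0, 170)   # 1.2 V operating point
    low = average_power_mw(2.7, 48)      # 0.51 V operating point
    print(f"1.20 V: {high:.3f} mW")
    print(f"0.51 V: {low:.4f} mW")
    print(f"energy per instruction ratio: {14.0 / 2.7:.2f}x")
```

Under that assumption the core draws about 2.38 mW at 1.2 V and about 0.13 mW at 0.51 V, while each instruction costs roughly 5.2 times less energy at the low-voltage point.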