Switch Fabric Implementation Using Shared Memory




Prepared by: Lakshmi Mandyam and B. Kinney

INTRODUCTION

Whether for the World Wide Web or for an intra-office network, today's businesses and services depend heavily on the telecommunications infrastructure. Computers and networks are inextricably linked in business today. Just as processors have increased their speed in instructions per second, users have demanded more bandwidth, from megabits to gigabits per second, from their networks. Many high speed technologies have emerged or are emerging, including FDDI, CDDI, Fast Ethernet, Gigabit Ethernet, and Asynchronous Transfer Mode (ATM). Only ATM has the ability to integrate voice, video, and data over a single medium, and it requires a great deal of high speed bandwidth.

One of the major ways that switch, router, and hub manufacturers have been able to differentiate their products has been in the fabric and buffer management. The shared or distributed cell or frame fabric, as well as the input and/or output buffers, determine:
1. The scalability of the network.
2. The non-blocking capacity of the network.
3. The throughput experienced by the users.
4. Each user's service levels.

These issues are common across all network devices. In the carrier network, ATM is widely used because it reduces the number of network overlays and provides seamless integration of LAN and WAN equipment. The switch fabric buffering scheme is of major importance to the flexibility and adaptability of the network. For the end user, new markets are being developed that require fast, efficient buffering schemes, such as multimedia support, bandwidth on demand, and high speed data transfer for internet use and for access to the corporate headquarters.

For data LAN networks, Gigabit Ethernet provides an easy migration to gigabit speeds by upgrading the existing legacy Ethernet 10BaseT and 100BaseT systems.
The higher throughput and the need to match different speeds on inputs and outputs also require a great deal of shared memory fabric and buffering.

SWITCH FABRIC

The function of the common block of memory, or shared fabric, is to store and then transfer cells or frames between other functional blocks in the switch/router. Examples of fabric functions are:
1. Cell or frame buffering.
2. Traffic concentration and multiplexing.
3. Redundancy for fault tolerance.
4. Support of multicasting or broadcasting.
5. Cell scheduling based on delay priorities.
6. Selective cell discarding based on loss priorities and congestion level.
7. Congestion monitoring.

There are many types of fabric architectures, and choosing one usually depends on where the switch will sit in the network and the amount of traffic it will be required to carry. In practice, a fabric implementation is often a combination of two or more basic architectures. Figure 1 illustrates the variety of possible switch fabric architectures. The architecture common to the majority of network devices is the shared fabric memory (see Figure 2). Freescale has several products available for shared memory implementation, including Late Write, BurstRAM, and a new product, NetRAM. Each design will differ slightly in implementation, but the basic functionality remains the same.

In shared fabric implementations, fast SRAMs are used to buffer data before transmitting it. The memories are organized in banks to create a shared memory block that is 576 bits wide for ATM and 64 bytes wide for Ethernet. The depth of the fabric is determined by the bandwidth required of the input and output ports. With dynamic allocation, the advantage of shared fabric is that it enhances the flexibility of determining network throughput and guarantees that data will not be blocked or dropped as it passes through the switch.
Motorola's new NetRAM, a dual port SRAM with configurable input/output data ports, is designed specifically for the communications market to deliver optimal performance for networking devices that demand write/read/write cycles. This application note discusses two implementations of switch fabric using the NetRAM:
1. The One Way switch fabric implementation, with separate input and output ports.
2. The Snoop switch fabric implementation, with two dual ports, each capable of supporting input and output.

REV 1

Figure 1. Types of Fabric Architectures

Figure 2. Shared Memory Switch Fabric

ONE WAY SWITCH FABRIC

In the first implementation (Figure 2), an ATM example realizes output buffering and can support multiple qualities of service (different priorities) and multicast/broadcast. Inputs to the common shared memory fabric are time division multiplexed (TDM) so that only one input port at a time is granted permission to deposit a cell into the shared memory. As the cell traverses the memory write controller, the output port destination is decoded and used to determine which output queue is to receive the cell. A free buffer address is generated to provide a write address for the cell in the fabric. In addition, this address is linked to the tail of the output queue managed by the memory read controller.

Cells leaving the fabric are time division demultiplexed so that only one output port at a time has access to the fabric. Arbitration is usually necessary, because many cells may be competing for access to the same output port. It is the responsibility of the memory read controller to determine which cell wins the arbitration and is forwarded to the output port. Once the cell has been transmitted, its fabric address is returned to the free fabric address pool.

An implementation of switch fabric using NetRAM in a one way path is shown in Figure 3. Input modules or line cards are connected to the NetRAM data inputs, and output modules or line cards are connected to the data outputs. The read address port and the write address port are connected to the external controller or application specific integrated circuit (ASIC). As previously noted, the ASIC generates the memory address at which the cell is stored until the destination output port becomes available. The store address is sent to the write address port of NetRAM, and the cell is written from the input port to the fabric after the write enable signal (WY) is activated. The timing diagram is shown in Figure 4.
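The buffer management described above (a free address pool, one queue of addresses per output port, and address recycling after transmission) can be sketched in a few lines of Python. This is only an illustrative model, not the ASIC logic of the application note: the class and method names are hypothetical, priorities and multicast are left out, and the behavior when the pool is empty (the write is refused) is an assumption.

```python
from collections import deque


class SharedMemoryFabric:
    """Toy model of an output-queued shared memory fabric (hypothetical names)."""

    def __init__(self, depth: int, num_ports: int) -> None:
        self.memory = [None] * depth                  # shared cell buffer
        self.free_addresses = deque(range(depth))     # free buffer address pool
        self.output_queues = [deque() for _ in range(num_ports)]

    def write_cell(self, cell: bytes, dest_port: int) -> bool:
        """Write controller: store one cell and link its address to the queue tail."""
        if not self.free_addresses:
            return False                              # assumption: refuse when full
        addr = self.free_addresses.popleft()
        self.memory[addr] = cell
        self.output_queues[dest_port].append(addr)
        return True

    def read_cell(self, port: int):
        """Read controller: forward the head-of-queue cell and recycle its address."""
        if not self.output_queues[port]:
            return None
        addr = self.output_queues[port].popleft()
        cell, self.memory[addr] = self.memory[addr], None
        self.free_addresses.append(addr)              # back to the free pool
        return cell


fabric = SharedMemoryFabric(depth=4, num_ports=2)
fabric.write_cell(b"cell-A", dest_port=1)
fabric.write_cell(b"cell-B", dest_port=1)
print(fabric.read_cell(1))  # b'cell-A'
```

Because every output queue draws from the same pool, a busy port can use buffers that idle ports do not need, which is the dynamic allocation benefit the text attributes to shared fabric.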
When a destination output port for the data cell becomes available, the ASIC recalls the address at which it previously stored the cell and sends it to the read address port on NetRAM. After the read is performed, the data is available at the output port two internal clock cycles later, provided that output enable (GX) is activated at that time. Notice that the internal clock runs at twice the frequency of the external clock. Because of this, board frequencies can be kept at or below 83 MHz while the device delivers performance equivalent to a conventional 166 MHz memory.

NetRAM is a pipelined part with reads occurring before writes. Therefore, if a read and a write occur at the same address on the same clock edge, the read returns the cell or frame that was previously in memory, and the write data is written after the read has occurred. Thus, NetRAM resolves potential address contention without the need for external arbitration logic or special ASICs to make sure the data is transferred without corruption.

Figure 3. One Way Switch Fabric Implementation Using NetRAM as a Separate I/O Device
(Block diagram: control logic (ASIC) supplying 15 or 16 bit addresses to the MCM69D536 or MCM69D618 NetRAM, with 36 or 18 bit data in from the line input and 36 or 18 bit data out to the line.)
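The read-before-write rule and the two cycle read latency described above can be illustrated with a small behavioral model. This is a sketch under stated assumptions, not a model of the actual device: the class name and the `clock` method are hypothetical, one call stands for one internal clock cycle, and output enable (GX) is assumed to be active whenever data emerges.

```python
class OneWayRamModel:
    """Behavioral sketch: separate read and write addresses, reads before writes."""

    READ_LATENCY = 2  # internal clock cycles from read to data out, per the text

    def __init__(self, depth: int) -> None:
        self.cells = [None] * depth
        self.pipeline = [None] * self.READ_LATENCY  # read data still in flight

    def clock(self, read_addr=None, write_addr=None, write_data=None):
        """Advance one cycle and return the data reaching the output (or None)."""
        out = self.pipeline.pop(0)
        # The read samples memory first, so it sees the previous contents.
        fetched = self.cells[read_addr] if read_addr is not None else None
        self.pipeline.append(fetched)
        # The write lands only after the read has been taken.
        if write_addr is not None:
            self.cells[write_addr] = write_data
        return out


ram = OneWayRamModel(depth=8)
ram.clock(write_addr=3, write_data="old cell")
ram.clock(read_addr=3, write_addr=3, write_data="new cell")  # same address, same edge
ram.clock()
print(ram.clock())  # old cell
```

A later read of address 3 returns "new cell", so neither the old nor the new data is lost and no external arbitration is needed in the model.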

Figure 4. Read/Write Timing Diagram for One Way Switch Fabric
(Timing diagram: clock K with timing parameters tKHKH, tKLKH, and tKHKL; port Y signals WY and D; port X signals GX and Q.)

There are additional functions available to the designer. The pass through function allows data to be sent from the input to the output, bypassing the memory. Consider cycle t4: pass through (PTY) and write enable (WY) are asserted, and as a result, data on NetRAM input port Y is written to address A(4), is passed through to port X, and becomes available on the output at cycle t6. When the system is trying to get data from the input port to the output port as soon as possible, the pass through function saves two cycles of system time over a RAM without it. Without this function, to send data from memory input to memory output, the user would have to wait until the data was written to the address and then send a read address to the address port. The write and pass through function sends the data from the input to the output and also writes it to memory in the normal fashion. Figure 5 shows both pass through and write and pass through timing for moving data from port Y to port X.

NetRAM's advantage over conventional single address, common I/O RAMs is that it allows reads and writes to be performed at different addresses during the same clock cycle. This dual address feature gives the one way fabric a very high throughput of 2.98 Gbits/sec per device, which is sufficient to support up to OC-12 throughput. In addition, most fabric implementations bank several memories together in parallel. In the example given, assume that 16 memories are banked together; the total available throughput would then be 16 x 2.98 Gbits/sec, or 47.8 Gbits/sec. Consider the flexibility of this part if the designer were to double up on the data input and data output.
The configurable data ports allow the designer to read simultaneously from both ports in the same clock cycle, providing a throughput of approximately 6 Gbits/sec with one device. With a 576 bit wide memory block and 16 NetRAMs connected in parallel (Figure 6), the maximum throughput becomes approximately 96 Gbits/sec. This is one of NetRAM's advantages when configured in the snoop switch fabric implementation, as described in the following section.
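The throughput figures quoted above are consistent with a simple product of clock rate, port width, number of ports, and number of banked devices. The calculation below is one reading that reproduces them, assuming the 83 MHz external clock and the 36 bit wide part; the function name is hypothetical, and the text's 2.98 Gbits/sec appears to be 2.988 truncated.

```python
CLOCK_HZ = 83e6        # external clock rate quoted in the text
PORT_WIDTH_BITS = 36   # MCM69D536 data port width
BANKED_DEVICES = 16    # devices banked in parallel in the example


def throughput_gbps(clock_hz, width_bits, ports=1, devices=1):
    """Peak throughput in Gbits/sec for the given configuration."""
    return clock_hz * width_bits * ports * devices / 1e9


one_way = throughput_gbps(CLOCK_HZ, PORT_WIDTH_BITS)
one_way_bank = throughput_gbps(CLOCK_HZ, PORT_WIDTH_BITS, devices=BANKED_DEVICES)
both_ports = throughput_gbps(CLOCK_HZ, PORT_WIDTH_BITS, ports=2)
both_ports_bank = throughput_gbps(CLOCK_HZ, PORT_WIDTH_BITS, ports=2,
                                  devices=BANKED_DEVICES)

print(f"memory block width: {PORT_WIDTH_BITS * BANKED_DEVICES} bits")  # 576 bits
print(f"one way, one device: {one_way:.3f} Gbits/sec")            # 2.988
print(f"one way, 16 devices: {one_way_bank:.3f} Gbits/sec")       # 47.808
print(f"both ports, one device: {both_ports:.3f} Gbits/sec")      # 5.976
print(f"both ports, 16 devices: {both_ports_bank:.3f} Gbits/sec") # 95.616
```

The 16 x 36 = 576 bit result also matches the ATM memory block width given earlier in the note.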

Figure 5. Write and Pass Through Timing for One Way Switch Fabric
(Timing diagram: clock K with timing parameters tKHKH, tKLKH, and tKHKL; port X signals GX and Q; port Y signals WY, PTY, and D.)

Figure 6. Snoop Switch Fabric Using NetRAMs as Dual I/O Devices
(Block diagram: the system bus and the ASIC each connect to the MCM69D536 or MCM69D618 NetRAM through a 15 or 16 bit address port and a 36 or 18 bit data port.)

SNOOP SWITCH FABRIC

The snoop switch fabric implementation of NetRAM is shown in Figure 6. The NetRAM is configured as a common I/O device, with the system bus using one port for storage and retrieval of cells or frames. At the same time, on the ASIC side of the dual port, the system is able to snoop into any address to retrieve data that the designer wants to verify, such as Ethernet destination addresses, or to modify, such as the VPI/VCI headers for ATM. It is common for the designer to include error checking bits or other intellectual property data that the system needs to have available as soon as the cell or frame begins to arrive in the shared memory. This makes it possible for decisions, such as the next destination or the priority of service, to be made as early as possible in the data input cycle. The ability to write back data at any address is one of the major advantages. This implementation removes the need for a separate path to split this critical information out of the data flow, because the entire cell or frame is placed into the shared memory, thus reducing chip count and board space.

The snoop dual ported fabric can be configured as 64K x 18 or 32K x 36 in a dual ported configuration, which is larger than any other dual port on the market. It is the first of many dual ported products that will be available from Freescale.

Consider the timing diagram in Figure 7. Assume that in cycle t2, the cell or frame D(2) is written to address A(2) in the fabric. In cycle t3, this cell data can be snooped, or read, from address A(2) by the ASIC using its address port, and it becomes available on the data bus at the end of cycle t4 and during cycle t5. The ASIC can then modify this data and write it back to the fabric in cycle t6.
Simultaneously, the cell transfer port can continue to either read from or write to the fabric, with deselects inserted between the read and write transitions to eliminate corruption of the data.

COMPARISON TO BURST SRAM

The computer market's extremely high volume usage of the burst SRAM ensured multi-vendor supply and competitive pricing for these parts. For the write once, read many functions of level 2 cache, burst SRAMs were perfect. For these cost and supply reasons, some communications designers have attempted to use them in switch fabric.

The common characteristic of burst SRAMs is that they have one address port and a common input/output port. As a result, during one clock cycle, the device can either read an address location or write to an address location, but not both. In addition, when the common I/O must be turned from read to write, a deselect cycle must be inserted into the timing, which results in one wait state. Turning the bus from write to read is even worse, since two deselect cycles must be inserted to make sure there are no conflicts on the common I/O bus. This translates to a utilization of clock cycles for data in or out of between 50% and 70%, depending on the pattern used to access the data in the fabric. For this reason, burst SRAM parts are not efficient for the write/read/write cycles of the communications market, and this limitation becomes more critical for the gigabit and terabit bandwidths of switches and switch/routers.

NetRAM can be configured as a separate I/O device or a dual I/O device, which avoids the overall system performance penalties imposed by the need to insert deselect cycles with a common I/O burst SRAM.
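The effect of the deselect cycles on bus utilization can be estimated with a short calculation. The sketch below takes the turnaround costs from the text (one deselect for read to write, two for write to read) and applies them to a repeating access pattern; the function name and the example patterns are hypothetical, chosen only to show how the result depends on the pattern.

```python
def bus_utilization(pattern: str, read_to_write: int = 1, write_to_read: int = 2) -> float:
    """Fraction of clock cycles that carry data for a repeating 'R'/'W' pattern."""
    dead_cycles = 0
    n = len(pattern)
    for i in range(n):
        current, following = pattern[i], pattern[(i + 1) % n]  # pattern wraps around
        if current == "R" and following == "W":
            dead_cycles += read_to_write
        elif current == "W" and following == "R":
            dead_cycles += write_to_read
    return n / (n + dead_cycles)


print(f"{bus_utilization('WWRR'):.1%}")    # 57.1%
print(f"{bus_utilization('WWWRRR'):.1%}")  # 66.7%
```

Longer runs of the same operation amortize the turnaround cost, while frequent switching between reads and writes, which is typical of switch fabric traffic, drives utilization down.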

Figure 7. Timing for Snoop Fabric Implementation
(Timing diagram over cycles t1 to t9: port X, the cell transfer side, with signals K, WX, and GX alternating writes and reads; port Y, the snoop side, with signals WY and GY performing snoop reads, modifications, and write backs of D(2) and D(6).)
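The snoop sequence in the SNOOP SWITCH FABRIC section (the cell transfer side writes a cell, the snoop side reads it, modifies it, and writes it back) can be sketched for the ATM VPI/VCI case. The bit positions below follow the standard ATM UNI header (GFC 4 bits, VPI 8 bits, VCI 16 bits, PT 3 bits, CLP 1 bit). Everything else is an assumption made for illustration: the function name is hypothetical, a plain list stands in for the dual ported fabric, and the first 32 header bits are assumed to sit in a single memory word.

```python
VPI_SHIFT, VPI_MASK = 20, 0xFF     # VPI occupies bits 27..20 of the 32 bit word
VCI_SHIFT, VCI_MASK = 4, 0xFFFF    # VCI occupies bits 19..4


def rewrite_vpi_vci(header: int, new_vpi: int, new_vci: int) -> int:
    """Replace the VPI and VCI fields, leaving GFC, PT, and CLP untouched."""
    field_mask = (VPI_MASK << VPI_SHIFT) | (VCI_MASK << VCI_SHIFT)
    kept = header & ~field_mask & 0xFFFFFFFF
    return kept | ((new_vpi & VPI_MASK) << VPI_SHIFT) | ((new_vci & VCI_MASK) << VCI_SHIFT)


fabric = [0] * 16                  # stand-in for the shared memory array
cell_addr = 2

# Cell transfer side: the arriving header word is written into the fabric.
fabric[cell_addr] = (0x1 << 28) | (0x12 << VPI_SHIFT) | (0x0034 << VCI_SHIFT) | 0x1

# Snoop side: read the word, modify the routing fields, write it back.
snooped = fabric[cell_addr]
fabric[cell_addr] = rewrite_vpi_vci(snooped, new_vpi=0x7F, new_vci=0xABCD)

print(hex(fabric[cell_addr]))  # 0x17fabcd1
```

In the device, the two sides would use separate address and data ports, so the cell transfer side could keep moving other cells while the header is being rewritten.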

SYSTEM PERFORMANCE ADVANTAGES OF NetRAM

NetRAM can perform reads and writes in the same clock cycle with separate addresses for each port, enabling the designer to create higher speed switch designs than would be possible with a conventional SRAM. Even with the dramatic increases in system frequency and speed that burst SRAMs have made available, the designer still cannot reach the throughput that is possible with the dual port NetRAM product while maintaining lower system frequencies. The communications designer needs to be able to write/read/write a great deal of the time, whereas burst SRAMs were created for the burst reads and writes of computing applications. In applications such as datacom switches, which frequently transition from reads to writes, BurstRAMs perform inefficiently, causing dead bus cycles between operations. This increases design complexity and reduces overall system performance. In addition, the dual port feature of the NetRAM enables the designer to simultaneously read and write to different addresses in each clock cycle. Even with recently announced products that read and write in back to back cycles without deselects, the throughput would be one half that of the NetRAM.

CONCLUSION

The first members of Motorola's NetRAM family will be available to meet your high performance communications system memory requirements. The MCM69D536 is a 3.3 volt synchronous 32K x 36 dual port SRAM with pipelined reads. This product is also available in the 64K x 18 configuration as the MCM69D618. The NetRAM offers a high performance memory solution with frequencies as high as 83 MHz and clock to data access times of 6 ns. This device also offers the flexibility of being used as either a dual I/O or a separate I/O device. The various configurations of the NetRAM are shown below.

Product Family Configurations

Part Number   Dual Address   Single Address   Dual I/O   Separate I/O
MCM69D536     Yes            Note 1           Yes        Note 2
MCM69D618     Yes            Note 1           Yes        Note 2

NOTES:
1. Tie the two address ports together for the part to function as a single address part.
2. Tie GX high and WX low for port X to be inputs, and tie WY high and GY low for port Y to be outputs.