SOC architecture and design


Rodney Sutton
10 years ago

1 SOC architecture and design
- system-on-chip (SOC)
  - processors: become components in a system
- SOC covers many topics
  - processor: pipelined, superscalar, VLIW, array, vector
  - storage: cache, embedded and external memory
  - interconnect: buses, network-on-chip
  - impact: time, area, power, reliability, configurability
  - customisability: specialized processors, reconfiguration
  - productivity/tools: model, explore, re-use, synthesise, verify
  - examples: crypto, graphics, media, network, comm, security
  - future: autonomous SOC, self-optimising/verifying design
- our focus: overview, processor, memory
wl


2 iPhone SOC
Figure: chip block diagram showing the processor (1 GHz ARM Cortex A8), memory and I/O blocks. Source: UC Berkeley

3 Basic system-on-chip model


4 AMD's Barcelona Multicore Processor
- 4 out-of-order cores
- 1.9 GHz clock rate
- 65nm technology
- 3 levels of caches
- integrated Northbridge
Figure: die layout with Core 1 to Core 4, a 512KB L2 cache per core, a 2MB shared L3 cache and the Northbridge.

5 SOC vs processors on chip
With lots of transistors, designs move in 2 ways:
- complete system on a chip
- multi-core processors with lots of cache

|               | System on chip                  | Processors on chip        |
|---------------|---------------------------------|---------------------------|
| processor     | multiple, simple, heterogeneous | few, complex, homogeneous |
| cache         | one level, small                | 2-3 levels, extensive     |
| memory        | embedded, on chip               | very large, off chip      |
| functionality | special purpose                 | general purpose           |
| interconnect  | wide, high bandwidth            | often through cache       |
| power, cost   | both low                        | both high                 |
| operation     | largely stand-alone             | need other chips          |


6 Processor types: overview

| Processor type | Architecture / Implementation approach                        |
|----------------|---------------------------------------------------------------|
| SIMD           | Single instruction applied to multiple functional units       |
| Vector         | Single instruction applied to multiple pipelined registers    |
| VLIW           | Multiple instructions issued each cycle under compiler control |
| Superscalar    | Multiple instructions issued each cycle under hardware control |


7 Processors for SOCs

| SOC                              | Basic ISA   | Processor description                      |
|----------------------------------|-------------|--------------------------------------------|
| Freescale e600: signal processing | PowerPC     | Superscalar with vector extension          |
| ClearSpeed CSX600: general       | Proprietary | Array processor with 96 processing elements |
| PlayStation 2: gaming            | MIPS        | Pipelined with 2 vector coprocessors       |
| ARM VFP11: general               | ARM         | Configurable vector coprocessor            |


8 Sequential and parallel machines
- basic single stream processors
  - pipelined: overlap operations in basic sequential
  - superscalar: transparent concurrency
  - VLIW: compiler-generated concurrency
- multiple streams, multiple functional units
  - array processors
  - vector processors
  - multiprocessors


9 Pipelined processor
Instruction #1  IF ID AG DF EX WB
Instruction #2     IF ID AG DF EX WB
Instruction #3        IF ID AG DF EX WB
Instruction #4           IF ID AG DF EX WB
Time runs from left to right.
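
The overlap in the pipelined processor slide can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not in the slides: every stage takes exactly one cycle, a new instruction enters each cycle, and there are no stalls or hazards. The function names are hypothetical.

```python
STAGES = ["IF", "ID", "AG", "DF", "EX", "WB"]  # stage names as on the slide


def pipeline_schedule(n_instructions, stages=STAGES):
    """Map instruction number -> {stage: cycle} for an ideal, stall-free pipeline."""
    return {
        i + 1: {stage: i + s + 1 for s, stage in enumerate(stages)}
        for i in range(n_instructions)
    }


def total_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles to finish n instructions: fill the pipe once, then one result per cycle."""
    return n_stages + n_instructions - 1 if n_instructions > 0 else 0


schedule = pipeline_schedule(4)
for instr, timing in schedule.items():
    # Each stage name is two characters plus a space, so shift by three per instruction.
    print(f"Instruction #{instr}  " + "   " * (instr - 1) + " ".join(timing))

print("pipelined cycles:  ", total_cycles(4))      # 6 + 4 - 1 = 9
print("unpipelined cycles:", 4 * len(STAGES))      # 4 * 6 = 24
```

Under these assumptions four instructions complete in 9 cycles rather than 24, because a later instruction starts its IF stage while the earlier one is in ID.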

10 Superscalar and VLIW processors
Figure: Instructions #1 to #6, each passing through the stages IF ID AG DF EX WB, plotted against time.


11 Superscalar VLIW hardware for parallelism control

12 Array processors
- instruction fields: mask, op, dest, sr1, sr2
- perform op if condition = mask
- operand can come from neighbour
- n PEs, each with memory; neighbour communications
- one instruction issued to all PEs
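
The array processor behaviour described above (one instruction issued to all PEs, executed only where condition = mask, with an operand possibly taken from a neighbour) might be modelled as follows. This is a rough sketch: the list-per-PE representation, the left-neighbour direction and the function names are assumptions made for illustration, not details from the slides.

```python
import operator


def array_step(op, mask, condition, sr1, sr2, dest):
    """Issue one instruction to every PE; a PE performs op only if its condition equals mask."""
    return [
        op(a, b) if c == mask else d   # inactive PEs keep their old dest value
        for c, a, b, d in zip(condition, sr1, sr2, dest)
    ]


def from_left_neighbour(values, fill=0):
    """Operand supplied by the left neighbour; the leftmost PE receives `fill` (assumed)."""
    return [fill] + values[:-1]


local = [1, 2, 3, 4]          # one value held in each of 4 PEs
condition = [1, 0, 1, 0]      # per-PE condition bits
result = array_step(
    operator.add, 1, condition, local, from_left_neighbour(local), [0] * 4
)
print(result)  # [1, 0, 5, 0]
```

Only PEs 0 and 2 satisfy the mask, so only they add their own value to the value from their left neighbour.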

13 Vector processors
- vector registers, eg 8 sets x 64 elements x 64 bits
- vector instructions: VR3 = VR2 VOP VR1
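
A vector instruction of the form VR3 = VR2 VOP VR1 can be illustrated with a small register-file model using the example sizes above (8 registers of 64 elements, 64 bits each). The class and method names are hypothetical, and wrapping results to 64 bits is an assumption about how overflow is handled.

```python
import operator

N_REGS, VLEN = 8, 64          # 8 vector registers x 64 elements, as in the example
MASK64 = (1 << 64) - 1        # keep each element within 64 bits (assumed wrap-around)


class VectorRegisterFile:
    def __init__(self):
        self.vr = [[0] * VLEN for _ in range(N_REGS)]

    def vop(self, op, dest, src_a, src_b):
        """VR[dest] = VR[src_a] op VR[src_b], applied element by element."""
        self.vr[dest] = [
            op(a, b) & MASK64 for a, b in zip(self.vr[src_a], self.vr[src_b])
        ]


vrf = VectorRegisterFile()
vrf.vr[1] = list(range(VLEN))      # VR1 = 0, 1, ..., 63
vrf.vr[2] = [10] * VLEN            # VR2 = 10, 10, ..., 10
vrf.vop(operator.add, 3, 2, 1)     # VR3 = VR2 + VR1
print(vrf.vr[3][:4])               # [10, 11, 12, 13]
```

A single call stands in for one vector instruction that operates on all 64 element pairs.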

14 Memory addressing: three levels
(each segment contains pages for a program/process)

15 User view of memory: addressing
- a program: process address (offset + base + index)
  - virtual address: from page address and process/user id
- segment table: process base and bound (for each process)
  - system address: process base + page address
- pages: active localities in main/real memory
  - virtual address: page table lookup to physical address
  - page miss: virtual pages not in page table
- TLB (translation look-aside buffer): recent translations
  - TLB entry: corresponding real and (virtual, id) address
  - a few hashed virtual address bits address TLB entries
  - if virtual, id = TLB (virtual, id) then use translation
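
The translation steps listed above can be put together in a small model: try the TLB first, otherwise use the segment table to form the system address and the page table to find the physical page. This is a sketch under stated assumptions: the 4 KiB page size, the 16-entry direct-mapped TLB, the modulo used as the hash, and all class and exception names are illustrative choices, not taken from the slides.

```python
PAGE_BITS = 12    # assumed 4 KiB pages
TLB_SIZE = 16     # assumed direct-mapped TLB with 16 entries


class BoundViolation(Exception):
    """Page address lies outside the bound recorded in the segment table."""


class PageMiss(Exception):
    """The page is not in the page table, so it is not in main memory."""


class MMU:
    def __init__(self, segment_table, page_table):
        self.segment_table = segment_table   # pid -> (process base page, bound in pages)
        self.page_table = page_table         # system page -> physical page
        self.tlb = [None] * TLB_SIZE         # entry: (virtual page, pid, physical page)
        self.tlb_hits = 0

    def translate(self, pid, vaddr):
        vpage = vaddr >> PAGE_BITS
        offset = vaddr & ((1 << PAGE_BITS) - 1)
        index = vpage % TLB_SIZE             # a few virtual address bits pick the entry
        entry = self.tlb[index]
        if entry is not None and entry[:2] == (vpage, pid):
            self.tlb_hits += 1               # (virtual, id) matches: use the translation
            return (entry[2] << PAGE_BITS) | offset
        base, bound = self.segment_table[pid]
        if vpage >= bound:
            raise BoundViolation(f"page {vpage} outside bound {bound}")
        system_page = base + vpage           # system address = process base + page address
        if system_page not in self.page_table:
            raise PageMiss(f"system page {system_page} not in page table")
        ppage = self.page_table[system_page]
        self.tlb[index] = (vpage, pid, ppage)  # remember the recent translation
        return (ppage << PAGE_BITS) | offset


mmu = MMU(segment_table={7: (100, 4)}, page_table={100: 5, 101: 9})
print(hex(mmu.translate(7, 0x0123)))  # 0x5123, found through the tables
print(hex(mmu.translate(7, 0x0456)))  # 0x5456, found in the TLB
print(mmu.tlb_hits)                   # 1
```

Storing the process id next to the virtual page in each TLB entry lets the same virtual page number from two processes map to different physical pages.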


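16 TLB and Paging: Address translation
Figure: translation path from Virtual Address, through the TLB (recent translations), the segment table (find process, giving the process base) and the System Address, then the page table (find page), to the Physical Address.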

17 SOC interconnect
- interconnecting multiple active agents requires
  - bandwidth: capacity to transmit information (bps)
  - protocol: logic for non-interfering message transmission
- bus
  - AMBA (Advanced Microcontroller Bus Architecture) from ARM, widely used for SOC
  - bus performance: can determine system performance
- network on chip
  - array of switches
  - statically switched: eg mesh
  - dynamically switched: eg crossbar


18 Design cost: product economics
- increasingly, product cost is determined by design costs, including verification, not by the marginal cost to produce
- manage complexity in die technology by
  - engineering effort
  - engineering cleverness
- design effort often dictated by product volume
Figure: design time and effort against basic physical tradeoffs; the balance point depends on n, the number of units.


19 Design complexity processors

20 Cost: product program vs engineering
Figure: breakdown of product cost. Labels: Product cost; Fixed costs; Variable costs; Engineering costs; Fixed project costs; Labor costs; Chip design; Verify & test; Software; CAD support; Engineering; Mask costs; CAD programs; Capital equipment; Manufacturing costs; Marketing, sales, administration.


21 Example: two scenarios
- fixed costs K_f, support costs 0.1 x function(n), and variable costs K_v x n, so
  total cost = K_f + 0.1 x function(n) + K_v x n
- design gets more complex, while production costs decrease
  - K_f increases while K_v decreases
  - if same price, requires higher volumes to break even
- when compared with 1995, in 2015
  - K_f increased by 10 times
  - K_v decreased by the same factor
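
The effect of a higher K_f and a lower K_v on the break-even volume can be checked numerically. The sketch below ignores the support term 0.1 x function(n), since the slides do not say what function(n) is, and every figure in it (price 20, K_f of 1,000,000, K_v of 10) is a made-up value used only to show the trend. The function names are hypothetical.

```python
import math


def total_cost(n, k_f, k_v, support=lambda n: 0.0):
    """Fixed costs + support costs + variable costs; support defaults to zero (assumed)."""
    return k_f + 0.1 * support(n) + k_v * n


def break_even_volume(price, k_f, k_v):
    """Smallest n with price * n >= k_f + k_v * n, support costs ignored."""
    if price <= k_v:
        raise ValueError("price must exceed the variable cost per unit")
    return math.ceil(k_f / (price - k_v))


price = 20                                             # same selling price in both scenarios
n_1995 = break_even_volume(price, 1_000_000, 10)       # hypothetical 1995 figures
n_2015 = break_even_volume(price, 10_000_000, 1)       # K_f x 10, K_v / 10
print(n_1995, n_2015)  # 100000 526316
```

With these illustrative numbers the break-even volume rises about fivefold even though each unit is cheaper to make, which matches the point that the same price needs higher volumes.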

22 More recent: higher NRE

23 IP: Intellectual Property

24 Answers to Unassessed Coursework 5
1. rdl 1 R = snd [-] -1 ; R
   rdl n+1 R = snd apr n -1 ; rsh ; fst (rdl n R) ; R
2. P0 = rdl n Pcell; 1
   <<s,x>, a> Pcell <sx+a, x>
3. rdl n R = row n (R i ; 2-1 ) ; 2
   P1 = loop (row n Pcell1 ; fst map n D) ; 1
   <<s,x>, a> Pcell1 <a,<sx+a, x>>
4. loop (row n R) = (loop R) n
   Proof: induction on n (see
   P1 = P2 ; [D,D] -n
   P2 = (loop (Pcell1 ; [D,[D,D]])) n
