Memory Architecture and Management in a NoC Platform

Size: px
Start display at page:

Download "Memory Architecture and Management in a NoC Platform"

Transcription

1 Architecture and Management in a NoC Platform Axel Jantsch Xiaowen Chen Zhonghai Lu Chaochao Feng Abdul Nameed Yuang Zhang Ahmed Hemani DATE 2011

2 Overview Motivation State of the Art Data Management Engine Architecture Virtual Address Space Cache Coherence Consistency Synchronization Message Passing based Communication Experiments DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 2 / 27

3 The Wall Wulf and McKee predicted in 1995 the impact into the memory wall: t avg = p t c +(1 p) t m t avg...average access latency for one data word p...probability of a cache hit t c...access time to the cache t m...access time to main memory DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 3 / 27

4 The Wall Wulf and McKee predicted in 1995 the impact into the memory wall: t avg = p t c +(1 p) t m t avg...average access latency for one data word p...probability of a cache hit t c...access time to the cache t m...access time to main memory Today we navigate at the edge of this wall. DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 3 / 27

5 The Wall Wulf and McKee predicted in 1995 the impact into the memory wall: t avg = p t c +(1 p) t m t avg...average access latency for one data word p...probability of a cache hit t c...access time to the cache t m...access time to main memory Today we navigate at the edge of this wall. For example the Tilera 64 core chip: Raw processor performance: 443 Gops DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 3 / 27

6 The Wall Wulf and McKee predicted in 1995 the impact into the memory wall: t avg = p t c +(1 p) t m t avg...average access latency for one data word p...probability of a cache hit t c...access time to the cache t m...access time to main memory Today we navigate at the edge of this wall. For example the Tilera 64 core chip: Raw processor performance: 443 Gops Required memory bandwidth with two cache levels and 95% hit rate: 7.2Gb/s Available memory bandwidth: 50 Gb/s DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 3 / 27

7 The Wall Wulf and McKee predicted in 1995 the impact into the memory wall: t avg = p t c +(1 p) t m t avg...average access latency for one data word p...probability of a cache hit t c...access time to the cache t m...access time to main memory Today we navigate at the edge of this wall. For example the Tilera 64 core chip: Raw processor performance: 443 Gops Required memory bandwidth with two cache levels and 95% hit rate: 7.2Gb/s Available memory bandwidth: 50 Gb/s Required memory access latency: cycles Available average access time: cycles DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 3 / 27

8 The Access Bottleneck I/O Bottleneck Processing DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 4 / 27

9 Bandwidth in 3D ICs Embedded Processor DRAM Die Processing Die (NoC) DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 5 / 27

10 Bandwidth in 3D ICs Processor DRAM Die Processing Die (NoC) Embedded 8x8 MPSoC Resource size: 2x2 mm2 Switch size: 100x100 um2 TSV pitch: 5 um TSVs/switch: 256 Gb/s memory bandwidth per core 16 Tb/s aggregate memory bandwidth (200 Gb/s for Tilera) < 10 ns delay DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 5 / 27

11 Bandwidth in 3D ICs Processor DRAM Die Processing Die (NoC) Embedded 8x8 MPSoC Resource size: 2x2 mm2 Switch size: 100x100 um2 TSV pitch: 5 um TSVs/switch: 256 Gb/s memory bandwidth per core 16 Tb/s aggregate memory bandwidth (200 Gb/s for Tilera) < 10 ns delay 80 x bandwidth 1/5 latency 1/10 power No area overhead DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 5 / 27

12 Access Protocols Protocol Characteristics Frequency buswidth Voltage max bandiwtdh Efficiency DDR MHz 32 bit 1.5 V GB/s 4.17 GB/s/W LPDDR2 533 MHz 32 bit 1.2 V GB/s 6.25 GB/s/W Wide I/O 200 MHz 512 bit 1.2 V 12.8 GB/s 25.0 GB/s/W Cost for 1 TB/s Protocol No of ports No of pads Power DDR W LPDDR W Wide I/O W DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 6 / 27

13 Package Architecture Roadmap 3D with TSVs POP with Stacked MCP Stacked MCP Off-Package From: Denis Dutois and Ahmed Jerraya. 3D Integration Opportunities for Interconnect in Mobile Computing Architectures. In: Future Fab International 34 (July 2010) DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 7 / 27

14 Architecture State of the Art Custom memory architectures in many SoCs and embedded systems with no shared memory space No HW support for shared memory space, cache coherence and consistency (e.g. Inte s 48 core SCC) Uniform shared memory space with snooping based cache coherence for small multi-core systems (ARM s Cortex A9) Non-Uniform Cache Architecture (NUCA) in regular many-core SoCs; introduced in 2002: C. Kim, D. Burger, and S. Keckler. An Adaptive, Non-uniform Cache Structure for Wire-Delay Dominated On-Chip Caches. In: Proceedings of the 20th International Conference on Architectural Support for Programming Languages and Operating Systems No general, flexible, and scalable solution for the many-core era DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 8 / 27

15 Platform Overview DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 9 / 27

16 Data Management Engine Architecture DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 10 / 27

17 Virtual Address Space Translation Page Address Offset Virtual Address Address Translation Table Node Address Page Address Physical Address Offset DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 11 / 27

18 Virtual Address Space Translation S T = (log 2 N +log 2 M N log 2 P S ) (M T /P S ) S T... Size of translation table per node N... Number of nodes M N... per node P S... Page size M T... Total memory size DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 12 / 27

19 Virtual Address Space Translation S T = (log 2 N +log 2 M N log 2 P S ) (M T /P S ) S T... Size of translation table per node N... Number of nodes M N... per node P S... Page size M T... Total memory size 64 nodes DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 12 / 27

20 Virtual Address Space Translation S T = (log 2 N +log 2 M N log 2 P S ) (M T /P S ) S T... Size of translation table per node N... Number of nodes M N... per node P S... Page size M T... Total memory size 64 nodes MB/node DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 12 / 27

21 Virtual Address Space Translation S T = (log 2 N +log 2 M N log 2 P S ) (M T /P S ) S T... Size of translation table per node N... Number of nodes M N... per node P S... Page size M T... Total memory size 64 nodes MB/node GB total memory DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 12 / 27

22 Virtual Address Space Translation S T = (log 2 N +log 2 M N log 2 P S ) (M T /P S ) S T... Size of translation table per node N... Number of nodes M N... per node P S... Page size M T... Total memory size 64 nodes MB/node GB total memory Page size 1KB DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 12 / 27

23 Virtual Address Space Translation S T = (log 2 N +log 2 M N log 2 P S ) (M T /P S ) S T... Size of translation table per node N... Number of nodes M N... per node P S... Page size M T... Total memory size 64 nodes MB/node GB total memory Page size 1KB Translation table: 1 M entries DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 12 / 27

24 Virtual Address Space Translation S T = (log 2 N +log 2 M N log 2 P S ) (M T /P S ) S T... Size of translation table per node N... Number of nodes M N... per node P S... Page size M T... Total memory size 64 nodes MB/node GB total memory Page size 1KB Translation table: 1 M entries entry: 6+14=20 bit 3 Byte DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 12 / 27

25 Virtual Address Space Translation S T = (log 2 N +log 2 M N log 2 P S ) (M T /P S ) S T... Size of translation table per node N... Number of nodes M N... per node P S... Page size M T... Total memory size 64 nodes MB/node GB total memory Page size 1KB Translation table: 1 M entries entry: 6+14=20 bit 3 Byte 3MB/node (18.75%) for translation table DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 12 / 27

26 DME Virtual Address Translation Page Address Offset Virtual Address NO > BADDR Page Address Offset YES Address Translation Table Node Address Page Address Offset Physical Address DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 13 / 27

27 Supported Partitions local private physical Supported local private virtual - local shared physical - local shared virtual Supported remote private physical - remote private virtual - remote shared physical - remote shared virtual Supported DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 14 / 27

28 Address Space Management Node 1 Address Space Node 2 Address Space Node 3 Address Space Node 4 Address Space 0X X BADDR 0X X Private, Local local Private, Local local remote BADDR 0X0F00000 local Shared, Local or local Private, Local remote Shared, Local or remote 0XFB00000 BADDR 0XFFF0000 local remote Shared, Local or remote remote 0XFFFC000 local 0XFFFFFFFF 0XFFFFFFFF 0XFFFFFFFF BADDR 0XFFFFFFFF DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 15 / 27

29 Multi-core Speedup DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 16 / 27

30 Experiment: Central vs. Distributed Shared DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 17 / 27

31 Central vs. Distributed Shared : Matrix Multiplication DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 18 / 27

32 Central vs. Distributed Shared : FFT DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 19 / 27

33 Summary After computation (multi-core) and communication (NoC), memory access must be parallalized DME parallizes memory handling DME supports Central/distributed memory Private/shared Physical/virtual address space DME features Synchronization support Cache coherence protocols consistency support Message passing communication Dynamic memory allocator Future Work Improve cache coherency Improved high level support for programming models Adapt DME to integrate heterogeneous SoC subsystems DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 20 / 27

34 Data Management Engine Implementation Optimized for area Optimized for speed Frequency 444 MHz (2.25 ns) 455 MHz (2.2 ns) Area (Logic) 44k NAND gates 51k NAND gates Area (Control Store) 300k NAND gates power consumption [mw] Mini A 6.9 Mini B 7.0 NICU 2.3 CICU 5.2 Synchronizer 0.2 DME total 21.6 DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 21 / 27

35 Directory Based Cache Coherence Processor Node Processor Node Processor Node Processor Node Cache Cache Cache Cache Processor Node Processor Node Processor Node Processor Node Cache Cache Cache Cache Processor Node Processor Node Processor Node Controller Cache Cache Cache Directory DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 22 / 27

36 Directory Structure Node 1 Node 2 Node 3 Directory Node N block DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 23 / 27

37 Directory Size 1 entry / memory block Node 1 Node 2 Node 3 Directory Node N block S D = S T /S B (N +1) S D... Size of directory S T... Total memory N... No of nodes C B... blocks/cache S B... Block size S C... Cache size DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 24 / 27

38 Directory Size 1 entry / memory block Node 1 Node 2 Node 3 Directory Node N block S D = S T /S B (N +1) S D... Size of directory S T... Total memory N... No of nodes C B... blocks/cache S B... Block size S C... Cache size 64 nodes DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 24 / 27

39 Directory Size 1 entry / memory block Node 1 Node 2 Node 3 Directory Node N block S D = S T /S B (N +1) S D... Size of directory S T... Total memory N... No of nodes C B... blocks/cache S B... Block size S C... Cache size 64 nodes GB total memory DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 24 / 27

40 Directory Size 1 entry / memory block Node 1 Node 2 Node 3 Directory Node N block S D = S T /S B (N +1) S D... Size of directory S T... Total memory N... No of nodes C B... blocks/cache S B... Block size S C... Cache size 64 nodes GB total memory Block size 64B DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 24 / 27

41 Directory Size 1 entry / memory block Node 1 Node 2 Node 3 Directory Node N block S D = S T /S B (N +1) S D... Size of directory S T... Total memory N... No of nodes C B... blocks/cache S B... Block size S C... Cache size 64 nodes GB total memory Block size 64B Cache size 8MB DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 24 / 27

42 Directory Size 1 entry / memory block Node 1 Node 2 Node 3 Directory Node N block S D = S T /S B (N +1) S D... Size of directory S T... Total memory N... No of nodes C B... blocks/cache S B... Block size S C... Cache size 64 nodes GB total memory Block size 64B Cache size 8MB No of blocks/cache: 128K DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 24 / 27

43 Directory Size 1 entry / memory block Node 1 Node 2 Node 3 Directory Node N block S D = S T /S B (N +1) S D... Size of directory S T... Total memory N... No of nodes C B... blocks/cache S B... Block size S C... Cache size 64 nodes GB total memory Block size 64B Cache size 8MB No of blocks/cache: 128K S D = 520MB 13% of total memory 102% of total cache size DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 24 / 27

44 Directory Size 1 entry / cache block Node 1 Node 2 Node 3 Directory Node N Cache block Cache 1 Cache 2 Cache 3 S D = N C B (N +1) S D... Size of directory S T... Total memory N... No of nodes C B... blocks/cache S B... Block size S C... Cache size DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 25 / 27

45 Directory Size 1 entry / cache block Node 1 Node 2 Node 3 Directory Node N Cache block Cache 1 Cache 2 Cache 3 S D = N C B (N +1) S D... Size of directory S T... Total memory N... No of nodes C B... blocks/cache S B... Block size S C... Cache size 64 nodes GB total memory Block size 64B Cache size 8MB No of blocks/cache: 128K S D = 65MB 1.5% of total memory 12.6% of total cache size DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 25 / 27

46 Consistency Experiment Initially, int x, y,z = 0; (0,0) CS NODE (0,1) (0,2) // Non-critical section1 x = data1; Reg1 = x; // Lock Acquire Acquire (L); (1,0) (1,1) SYNC NODE (1,2) // Critical Section y = data2; Reg2 = y; // Lock Release Release(L) (2,0) (2,1) (2,2) (a) // Non-critical Section2 z = data3; Reg3 = z; (b) DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 26 / 27

47 Consistency Execution Time 10 x Average Code Latency (Release consistency) Max Code Latency (Release consistency) Average Code Latency (Weak consistency) Max Code Latency (Weak consistency) Average Code Latency (Sequential consistency) Max Code Latency (Sequential consistency) Cycles x1 1x2 2x2 2x4 4x4 4x8 8x8 Network Size DATE 2011 Tutorial MPSoC Hardware/Software Architectural and Design Challenges/Solutions, Axel Jantsch, KTH 27 / 27

Networking Virtualization Using FPGAs

Networking Virtualization Using FPGAs Networking Virtualization Using FPGAs Russell Tessier, Deepak Unnikrishnan, Dong Yin, and Lixin Gao Reconfigurable Computing Group Department of Electrical and Computer Engineering University of Massachusetts,

More information

Putting it all together: Intel Nehalem. http://www.realworldtech.com/page.cfm?articleid=rwt040208182719

Putting it all together: Intel Nehalem. http://www.realworldtech.com/page.cfm?articleid=rwt040208182719 Putting it all together: Intel Nehalem http://www.realworldtech.com/page.cfm?articleid=rwt040208182719 Intel Nehalem Review entire term by looking at most recent microprocessor from Intel Nehalem is code

More information

RAID. RAID 0 No redundancy ( AID?) Just stripe data over multiple disks But it does improve performance. Chapter 6 Storage and Other I/O Topics 29

RAID. RAID 0 No redundancy ( AID?) Just stripe data over multiple disks But it does improve performance. Chapter 6 Storage and Other I/O Topics 29 RAID Redundant Array of Inexpensive (Independent) Disks Use multiple smaller disks (c.f. one large disk) Parallelism improves performance Plus extra disk(s) for redundant data storage Provides fault tolerant

More information

Outline. Introduction. Multiprocessor Systems on Chip. A MPSoC Example: Nexperia DVP. A New Paradigm: Network on Chip

Outline. Introduction. Multiprocessor Systems on Chip. A MPSoC Example: Nexperia DVP. A New Paradigm: Network on Chip Outline Modeling, simulation and optimization of Multi-Processor SoCs (MPSoCs) Università of Verona Dipartimento di Informatica MPSoCs: Multi-Processor Systems on Chip A simulation platform for a MPSoC

More information

New Dimensions in Configurable Computing at runtime simultaneously allows Big Data and fine Grain HPC

New Dimensions in Configurable Computing at runtime simultaneously allows Big Data and fine Grain HPC New Dimensions in Configurable Computing at runtime simultaneously allows Big Data and fine Grain HPC Alan Gara Intel Fellow Exascale Chief Architect Legal Disclaimer Today s presentations contain forward-looking

More information

FPGA-based Multithreading for In-Memory Hash Joins

FPGA-based Multithreading for In-Memory Hash Joins FPGA-based Multithreading for In-Memory Hash Joins Robert J. Halstead, Ildar Absalyamov, Walid A. Najjar, Vassilis J. Tsotras University of California, Riverside Outline Background What are FPGAs Multithreaded

More information

Exploiting Remote Memory Operations to Design Efficient Reconfiguration for Shared Data-Centers over InfiniBand

Exploiting Remote Memory Operations to Design Efficient Reconfiguration for Shared Data-Centers over InfiniBand Exploiting Remote Memory Operations to Design Efficient Reconfiguration for Shared Data-Centers over InfiniBand P. Balaji, K. Vaidyanathan, S. Narravula, K. Savitha, H. W. Jin D. K. Panda Network Based

More information

Lecture 23: Multiprocessors

Lecture 23: Multiprocessors Lecture 23: Multiprocessors Today s topics: RAID Multiprocessor taxonomy Snooping-based cache coherence protocol 1 RAID 0 and RAID 1 RAID 0 has no additional redundancy (misnomer) it uses an array of disks

More information

ICRI-CI Retreat Architecture track

ICRI-CI Retreat Architecture track ICRI-CI Retreat Architecture track Uri Weiser June 5 th 2015 - Funnel: Memory Traffic Reduction for Big Data & Machine Learning (Uri) - Accelerators for Big Data & Machine Learning (Ran) - Machine Learning

More information

A Look Inside Smartphone and Tablets

A Look Inside Smartphone and Tablets A Look Inside Smartphone and Tablets Devices and Trends John Scott-Thomas TechInsights Semicon West July 9, 2013 Teardown 400 phones and tablets a year Four areas: Customer Focus Camera Display Manufacturer

More information

Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks

Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks WHITE PAPER July 2014 Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks Contents Executive Summary...2 Background...3 InfiniteGraph...3 High Performance

More information

Kalray MPPA Massively Parallel Processing Array

Kalray MPPA Massively Parallel Processing Array Kalray MPPA Massively Parallel Processing Array Next-Generation Accelerated Computing February 2015 2015 Kalray, Inc. All Rights Reserved February 2015 1 Accelerated Computing 2015 Kalray, Inc. All Rights

More information

Scaling Mobile Compute to the Data Center. John Goodacre

Scaling Mobile Compute to the Data Center. John Goodacre Scaling Mobile Compute to the Data Center John Goodacre Director Technology and Systems, ARM Ltd. Cambridge Professor Computer Architectures, APT. Manchester EuroServer Project EUROSERVER is a European

More information

1 Storage Devices Summary

1 Storage Devices Summary Chapter 1 Storage Devices Summary Dependability is vital Suitable measures Latency how long to the first bit arrives Bandwidth/throughput how fast does stuff come through after the latency period Obvious

More information

Introduction to Exploration and Optimization of Multiprocessor Embedded Architectures based on Networks On-Chip

Introduction to Exploration and Optimization of Multiprocessor Embedded Architectures based on Networks On-Chip Introduction to Exploration and Optimization of Multiprocessor Embedded Architectures based on Networks On-Chip Cristina SILVANO silvano@elet.polimi.it Politecnico di Milano, Milano (Italy) Talk Outline

More information

Accelerating Server Storage Performance on Lenovo ThinkServer

Accelerating Server Storage Performance on Lenovo ThinkServer Accelerating Server Storage Performance on Lenovo ThinkServer Lenovo Enterprise Product Group April 214 Copyright Lenovo 214 LENOVO PROVIDES THIS PUBLICATION AS IS WITHOUT WARRANTY OF ANY KIND, EITHER

More information

Assessing the Performance of OpenMP Programs on the Intel Xeon Phi

Assessing the Performance of OpenMP Programs on the Intel Xeon Phi Assessing the Performance of OpenMP Programs on the Intel Xeon Phi Dirk Schmidl, Tim Cramer, Sandra Wienke, Christian Terboven, and Matthias S. Müller schmidl@rz.rwth-aachen.de Rechen- und Kommunikationszentrum

More information

Introducing the Singlechip Cloud Computer

Introducing the Singlechip Cloud Computer Introducing the Singlechip Cloud Computer Exploring the Future of Many-core Processors White Paper Intel Labs Jim Held Intel Fellow, Intel Labs Director, Tera-scale Computing Research Sean Koehl Technology

More information

Web Email DNS Peer-to-peer systems (file sharing, CDNs, cycle sharing)

Web Email DNS Peer-to-peer systems (file sharing, CDNs, cycle sharing) 1 1 Distributed Systems What are distributed systems? How would you characterize them? Components of the system are located at networked computers Cooperate to provide some service No shared memory Communication

More information

- Nishad Nerurkar. - Aniket Mhatre

- Nishad Nerurkar. - Aniket Mhatre - Nishad Nerurkar - Aniket Mhatre Single Chip Cloud Computer is a project developed by Intel. It was developed by Intel Lab Bangalore, Intel Lab America and Intel Lab Germany. It is part of a larger project,

More information

Accelerate SQL Server 2014 AlwaysOn Availability Groups with Seagate. Nytro Flash Accelerator Cards

Accelerate SQL Server 2014 AlwaysOn Availability Groups with Seagate. Nytro Flash Accelerator Cards Accelerate SQL Server 2014 AlwaysOn Availability Groups with Seagate Nytro Flash Accelerator Cards Technology Paper Authored by: Mark Pokorny, Database Engineer, Seagate Overview SQL Server 2014 provides

More information

3D-IC Integration. Developments. Cooperation for servicing and MPW runs offering. CMP annual users meeting, 25 January 2012, PARIS.

3D-IC Integration. Developments. Cooperation for servicing and MPW runs offering. CMP annual users meeting, 25 January 2012, PARIS. 3D-IC Integration Developments Cooperation for servicing and MPW runs offering Agenda Introduction Process overview Partnership for MPW runs service 3D-IC Design Platform First MPW run Conclusion 3D-IC

More information

Storage is in Our DNA. Industry-leading 12Gb/s and 6Gb/s Storage Solutions RAID Adapters, Host Bus Adapters, and SAS Expander

Storage is in Our DNA. Industry-leading 12Gb/s and 6Gb/s Storage Solutions RAID Adapters, Host Bus Adapters, and SAS Expander Storage is in Our DNA Industry-leading 12Gb/s and 6Gb/s Storage Solutions RAID Adapters, Host Bus Adapters, and SAS Expander 12Gb/s PCIe Gen3 RAID Adapters Series 8 and Series 8Q The Series 8 and Series

More information

Scaling Hadoop for Multi-Core and Highly Threaded Systems

Scaling Hadoop for Multi-Core and Highly Threaded Systems Scaling Hadoop for Multi-Core and Highly Threaded Systems Jangwoo Kim, Zoran Radovic Performance Architects Architecture Technology Group Sun Microsystems Inc. Project Overview Hadoop Updates CMT Hadoop

More information

Current Status of FEFS for the K computer

Current Status of FEFS for the K computer Current Status of FEFS for the K computer Shinji Sumimoto Fujitsu Limited Apr.24 2012 LUG2012@Austin Outline RIKEN and Fujitsu are jointly developing the K computer * Development continues with system

More information

Thread Level Parallelism (TLP)

Thread Level Parallelism (TLP) Thread Level Parallelism (TLP) Calcolatori Elettronici 2 http://www.dii.unisi.it/~giorgi/didattica/calel2 TLP: SUN Microsystems vision (2004) Roberto Giorgi, Universita di Siena, C208L15, Slide 2 Estimated

More information

FPGA Accelerator Virtualization in an OpenPOWER cloud. Fei Chen, Yonghua Lin IBM China Research Lab

FPGA Accelerator Virtualization in an OpenPOWER cloud. Fei Chen, Yonghua Lin IBM China Research Lab FPGA Accelerator Virtualization in an OpenPOWER cloud Fei Chen, Yonghua Lin IBM China Research Lab Trend of Acceleration Technology Acceleration in Cloud is Taking Off Used FPGA to accelerate Bing search

More information

Improving the performance of data servers on multicore architectures. Fabien Gaud

Improving the performance of data servers on multicore architectures. Fabien Gaud Improving the performance of data servers on multicore architectures Fabien Gaud Grenoble University Advisors: Jean-Bernard Stefani, Renaud Lachaize and Vivien Quéma Sardes (INRIA/LIG) December 2, 2010

More information

G22.3250-001. Porcupine. Robert Grimm New York University

G22.3250-001. Porcupine. Robert Grimm New York University G22.3250-001 Porcupine Robert Grimm New York University Altogether Now: The Three Questions! What is the problem?! What is new or different?! What are the contributions and limitations? Porcupine from

More information

White Paper The Numascale Solution: Extreme BIG DATA Computing

White Paper The Numascale Solution: Extreme BIG DATA Computing White Paper The Numascale Solution: Extreme BIG DATA Computing By: Einar Rustad ABOUT THE AUTHOR Einar Rustad is CTO of Numascale and has a background as CPU, Computer Systems and HPC Systems De-signer

More information

numascale White Paper The Numascale Solution: Extreme BIG DATA Computing Hardware Accellerated Data Intensive Computing By: Einar Rustad ABSTRACT

numascale White Paper The Numascale Solution: Extreme BIG DATA Computing Hardware Accellerated Data Intensive Computing By: Einar Rustad ABSTRACT numascale Hardware Accellerated Data Intensive Computing White Paper The Numascale Solution: Extreme BIG DATA Computing By: Einar Rustad www.numascale.com Supemicro delivers 108 node system with Numascale

More information

Technical Note. Micron NAND Flash Controller via Xilinx Spartan -3 FPGA. Overview. TN-29-06: NAND Flash Controller on Spartan-3 Overview

Technical Note. Micron NAND Flash Controller via Xilinx Spartan -3 FPGA. Overview. TN-29-06: NAND Flash Controller on Spartan-3 Overview Technical Note TN-29-06: NAND Flash Controller on Spartan-3 Overview Micron NAND Flash Controller via Xilinx Spartan -3 FPGA Overview As mobile product capabilities continue to expand, so does the demand

More information

THE EXPAND PARALLEL FILE SYSTEM A FILE SYSTEM FOR CLUSTER AND GRID COMPUTING. José Daniel García Sánchez ARCOS Group University Carlos III of Madrid

THE EXPAND PARALLEL FILE SYSTEM A FILE SYSTEM FOR CLUSTER AND GRID COMPUTING. José Daniel García Sánchez ARCOS Group University Carlos III of Madrid THE EXPAND PARALLEL FILE SYSTEM A FILE SYSTEM FOR CLUSTER AND GRID COMPUTING José Daniel García Sánchez ARCOS Group University Carlos III of Madrid Contents 2 The ARCOS Group. Expand motivation. Expand

More information

The Quest for Speed - Memory. Cache Memory. A Solution: Memory Hierarchy. Memory Hierarchy

The Quest for Speed - Memory. Cache Memory. A Solution: Memory Hierarchy. Memory Hierarchy The Quest for Speed - Memory Cache Memory CSE 4, Spring 25 Computer Systems http://www.cs.washington.edu/4 If all memory accesses (IF/lw/sw) accessed main memory, programs would run 20 times slower And

More information

Important Differences Between Consumer and Enterprise Flash Architectures

Important Differences Between Consumer and Enterprise Flash Architectures Important Differences Between Consumer and Enterprise Flash Architectures Robert Sykes Director of Firmware Flash Memory Summit 2013 Santa Clara, CA OCZ Technology Introduction This presentation will describe

More information

PowerLinux introduction

PowerLinux introduction PowerLinux introduction Course materials may not be reproduced in whole or in part without the prior written permission of IBM. 9.0 Unit objectives After completing this unit, you should be able to: Recognize

More information

Intel Itanium Architecture

Intel Itanium Architecture Intel Itanium Architecture Roadmap and Technology Update Dr. Gernot Hoyler Technical Marketing EMEA Intel Itanium Architecture Growth MARKET Over 3x revenue growth Y/Y* More than 10x growth* in shipments

More information

Optimizing Parallel Reduction in CUDA. Mark Harris NVIDIA Developer Technology

Optimizing Parallel Reduction in CUDA. Mark Harris NVIDIA Developer Technology Optimizing Parallel Reduction in CUDA Mark Harris NVIDIA Developer Technology Parallel Reduction Common and important data parallel primitive Easy to implement in CUDA Harder to get it right Serves as

More information

YarcData urika Technical White Paper

YarcData urika Technical White Paper YarcData urika Technical White Paper 2012 Cray Inc. All rights reserved. Specifications subject to change without notice. Cray is a registered trademark, YarcData, urika and Threadstorm are trademarks

More information

Annotation to the assignments and the solution sheet. Note the following points

Annotation to the assignments and the solution sheet. Note the following points Computer rchitecture 2 / dvanced Computer rchitecture Seite: 1 nnotation to the assignments and the solution sheet This is a multiple choice examination, that means: Solution approaches are not assessed

More information

Speeding Up Cloud/Server Applications Using Flash Memory

Speeding Up Cloud/Server Applications Using Flash Memory Speeding Up Cloud/Server Applications Using Flash Memory Sudipta Sengupta Microsoft Research, Redmond, WA, USA Contains work that is joint with B. Debnath (Univ. of Minnesota) and J. Li (Microsoft Research,

More information

AN EFFICIENT DESIGN OF LATCHES FOR MULTI-CLOCK MULTI- MICROCONTROLLER SYSTEM ON CHIP FOR BUS SYNCHRONIZATION

AN EFFICIENT DESIGN OF LATCHES FOR MULTI-CLOCK MULTI- MICROCONTROLLER SYSTEM ON CHIP FOR BUS SYNCHRONIZATION N EFFICIENT ESIGN OF LTCHES FOR MULTI-CLOCK MULTI- MICROCONTROLLER SYSTEM ON CHIP FOR US SYNCHRONIZTION noop Kumar Vishwakarma 1, Neerja Singh 2 1 Student (M.Tech.), ECE, ES Engineering College Ghaziabad,

More information

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB Planet Size Data!? Gartner s 10 key IT trends for 2012 unstructured data will grow some 80% over the course of the next

More information

Enabling the Use of Data

Enabling the Use of Data Enabling the Use of Data Michael Kagan, CTO June 1, 2015 - Technion Computer Engineering Conference Safe Harbor Statement These slides and the accompanying oral presentation contain forward-looking statements

More information

High Performance Computing in the Multi-core Area

High Performance Computing in the Multi-core Area High Performance Computing in the Multi-core Area Arndt Bode Technische Universität München Technology Trends for Petascale Computing Architectures: Multicore Accelerators Special Purpose Reconfigurable

More information

Desktop Processor Roadmap. Solution Provider Accounts

Desktop Processor Roadmap. Solution Provider Accounts Desktop Processor Roadmap Solution Provider Accounts August 2008 Desktop Division Roadmap Changes since July 2008 Additions Energy-efficient Brisbane 5050e processor to launch in Q408 Desktop Processors

More information

Flash Memory Arrays Enabling the Virtualized Data Center. July 2010

Flash Memory Arrays Enabling the Virtualized Data Center. July 2010 Flash Memory Arrays Enabling the Virtualized Data Center July 2010 2 Flash Memory Arrays Enabling the Virtualized Data Center This White Paper describes a new product category, the flash Memory Array,

More information

Table Of Contents. Page 2 of 26. *Other brands and names may be claimed as property of others.

Table Of Contents. Page 2 of 26. *Other brands and names may be claimed as property of others. Technical White Paper Revision 1.1 4/28/10 Subject: Optimizing Memory Configurations for the Intel Xeon processor 5500 & 5600 series Author: Scott Huck; Intel DCG Competitive Architect Target Audience:

More information

32 K. Stride 8 32 128 512 2 K 8 K 32 K 128 K 512 K 2 M

32 K. Stride 8 32 128 512 2 K 8 K 32 K 128 K 512 K 2 M Evaluating a Real Machine Performance Isolation using Microbenchmarks Choosing Workloads Evaluating a Fixed-size Machine Varying Machine Size Metrics All these issues, plus more, relevant to evaluating

More information

SOS: Software-Based Out-of-Order Scheduling for High-Performance NAND Flash-Based SSDs

SOS: Software-Based Out-of-Order Scheduling for High-Performance NAND Flash-Based SSDs SOS: Software-Based Out-of-Order Scheduling for High-Performance NAND -Based SSDs Sangwook Shane Hahn, Sungjin Lee, and Jihong Kim Department of Computer Science and Engineering, Seoul National University,

More information

David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems

David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems About me David Rioja Redondo Telecommunication Engineer - Universidad de Alcalá >2 years building and managing clusters UPM

More information

Lesson 7: SYSTEM-ON. SoC) AND USE OF VLSI CIRCUIT DESIGN TECHNOLOGY. Chapter-1L07: "Embedded Systems - ", Raj Kamal, Publs.: McGraw-Hill Education

Lesson 7: SYSTEM-ON. SoC) AND USE OF VLSI CIRCUIT DESIGN TECHNOLOGY. Chapter-1L07: Embedded Systems - , Raj Kamal, Publs.: McGraw-Hill Education Lesson 7: SYSTEM-ON ON-CHIP (SoC( SoC) AND USE OF VLSI CIRCUIT DESIGN TECHNOLOGY 1 VLSI chip Integration of high-level components Possess gate-level sophistication in circuits above that of the counter,

More information

VI Performance Monitoring

VI Performance Monitoring VI Performance Monitoring Preetham Gopalaswamy Group Product Manager Ravi Soundararajan Staff Engineer September 15, 2008 Agenda Introduction to performance monitoring in VI Common customer/partner questions

More information

Distributed RAID Architectures for Cluster I/O Computing. Kai Hwang

Distributed RAID Architectures for Cluster I/O Computing. Kai Hwang Distributed RAID Architectures for Cluster I/O Computing Kai Hwang Internet and Cluster Computing Lab. University of Southern California 1 Presentation Outline : Scalable Cluster I/O The RAID-x Architecture

More information

SiS AMD Athlon TM 64FX/PCI-E Solution. Silicon Integrated Systems Corp. Integrated Product Division April. 2004

SiS AMD Athlon TM 64FX/PCI-E Solution. Silicon Integrated Systems Corp. Integrated Product Division April. 2004 SiS AMD Athlon TM 64FX/PCI-E Solution Silicon Integrated Systems Corp. Integrated Product Division April. 2004 Agenda SiS AMD Athlon 64/ 64FX Product Roadmap SiS756 Architecture Advanced Feature of SiS756

More information

QorIQ Advanced Multiprocessing (AMP) Series freescale.com

QorIQ Advanced Multiprocessing (AMP) Series freescale.com QorIQ Advanced Multiprocessing (AMP) Series Delivers More than Moore Freescale s new QorIQ AMP series pushes the compute and energy performance envelope beyond the P4080 processor such that its performance

More information

SRAM Scaling Limit: Its Circuit & Architecture Solutions

SRAM Scaling Limit: Its Circuit & Architecture Solutions SRAM Scaling Limit: Its Circuit & Architecture Solutions Nam Sung Kim, Ph.D. Assistant Professor Department of Electrical and Computer Engineering University of Wisconsin - Madison SRAM VCC min Challenges

More information

Exploiting Transparent Remote Memory Access for Non-Contiguous- and One-Sided-Communication

Exploiting Transparent Remote Memory Access for Non-Contiguous- and One-Sided-Communication Workshop for Communication Architecture in Clusters, IPDPS 2002: Exploiting Transparent Remote Memory Access for Non-Contiguous- and One-Sided-Communication Joachim Worringen, Andreas Gäer, Frank Reker

More information

W H I T E P A P E R. VMware Infrastructure Architecture Overview

W H I T E P A P E R. VMware Infrastructure Architecture Overview W H I T E P A P E R ware Infrastructure Architecture Overview ware white paper Table of Contents Physical Topology of the ware Infrastructure Data Center............................... 4 Virtual Data Center

More information

Internet-in-a-Box: Emulating Datacenter Network Architectures using FPGAs

Internet-in-a-Box: Emulating Datacenter Network Architectures using FPGAs Internet-in-a-Box: Emulating Datacenter Network Architectures using FPGAs Jonathan Ellithorpe Research Project Submitted to the Department of Electrical Engineering and Computer Sciences, University of

More information

High Throughput Computing Data Center Architecture Thinking of Data Center 3.0

High Throughput Computing Data Center Architecture Thinking of Data Center 3.0 Technical White Paper High Throughput Computing Data Center Architecture Thinking of Data Center 3.0 Abstract In the last few decades, data center (DC) technologies have kept evolving from DC 1.0 (tightly-coupled

More information

Itanium 2 Platform and Technologies. Alexander Grudinski Business Solution Specialist Intel Corporation

Itanium 2 Platform and Technologies. Alexander Grudinski Business Solution Specialist Intel Corporation Itanium 2 Platform and Technologies Alexander Grudinski Business Solution Specialist Intel Corporation Intel s s Itanium platform Top 500 lists: Intel leads with 84 Itanium 2-based systems Continued growth

More information

High-Performance SSD-Based RAID Storage. Madhukar Gunjan Chakhaiyar Product Test Architect

High-Performance SSD-Based RAID Storage. Madhukar Gunjan Chakhaiyar Product Test Architect High-Performance SSD-Based RAID Storage Madhukar Gunjan Chakhaiyar Product Test Architect 1 Agenda HDD based RAID Performance-HDD based RAID Storage Dynamics driving to SSD based RAID Storage Evolution

More information

Providing Battery-Free, FPGA-Based RAID Cache Solutions

Providing Battery-Free, FPGA-Based RAID Cache Solutions Providing Battery-Free, FPGA-Based RAID Cache Solutions WP-01141-1.0 White Paper RAID adapter cards are critical data-center subsystem components that ensure data storage and recovery during power outages.

More information

Accelerating the Data Plane With the TILE-Mx Manycore Processor

Accelerating the Data Plane With the TILE-Mx Manycore Processor Accelerating the Data Plane With the TILE-Mx Manycore Processor Bob Doud Director of Marketing EZchip Linley Data Center Conference February 25 26, 2015 1 Announcing the World s First 100-Core A 64-Bit

More information

Efficient Parallel Graph Exploration on Multi-Core CPU and GPU

Efficient Parallel Graph Exploration on Multi-Core CPU and GPU Efficient Parallel Graph Exploration on Multi-Core CPU and GPU Pervasive Parallelism Laboratory Stanford University Sungpack Hong, Tayo Oguntebi, and Kunle Olukotun Graph and its Applications Graph Fundamental

More information

D1.2 Network Load Balancing

D1.2 Network Load Balancing D1. Network Load Balancing Ronald van der Pol, Freek Dijkstra, Igor Idziejczak, and Mark Meijerink SARA Computing and Networking Services, Science Park 11, 9 XG Amsterdam, The Netherlands June ronald.vanderpol@sara.nl,freek.dijkstra@sara.nl,

More information

Introducción. Diseño de sistemas digitales.1

Introducción. Diseño de sistemas digitales.1 Introducción Adapted from: Mary Jane Irwin ( www.cse.psu.edu/~mji ) www.cse.psu.edu/~cg431 [Original from Computer Organization and Design, Patterson & Hennessy, 2005, UCB] Diseño de sistemas digitales.1

More information

YALES2 porting on the Xeon- Phi Early results

YALES2 porting on the Xeon- Phi Early results YALES2 porting on the Xeon- Phi Early results Othman Bouizi Ghislain Lartigue Innovation and Pathfinding Architecture Group in Europe, Exascale Lab. Paris CRIHAN - Demi-journée calcul intensif, 16 juin

More information

HMI EMBEDDED SYSTEM DESIGN AS A FUNCTION OF TECU

HMI EMBEDDED SYSTEM DESIGN AS A FUNCTION OF TECU HMI EMBEDDED SYSTEM DESIGN AS A FUNCTION OF TECU Katrenčík J., Čupera J., Fajman M. Department of Technology and Automobile Transport, Faculty of Agronomy, Mendel University in Brno, Zemedelska 1, 613

More information

FPGA Acceleration using OpenCL & PCIe Accelerators MEW 25

FPGA Acceleration using OpenCL & PCIe Accelerators MEW 25 FPGA Acceleration using OpenCL & PCIe Accelerators MEW 25 December 2014 FPGAs in the news» Catapult» Accelerate BING» 2x search acceleration:» ½ the number of servers»

More information

SQL Server 2008 Performance and Scale

SQL Server 2008 Performance and Scale SQL Server 2008 Performance and Scale White Paper Published: February 2008 Updated: July 2008 Summary: Microsoft SQL Server 2008 incorporates the tools and technologies that are necessary to implement

More information

Datasheet. High-Performance airmax Bridge. Models: NBE M5-19, NBE-M5-16. Uniform Beamwidth Maximizes Noise Immunity. Innovative Mechanical Design

Datasheet. High-Performance airmax Bridge. Models: NBE M5-19, NBE-M5-16. Uniform Beamwidth Maximizes Noise Immunity. Innovative Mechanical Design High-Performance airmax Bridge Models: NBE M5-19, NBE-M5-16 Uniform Beamwidth Maximizes Noise Immunity Innovative Mechanical Design High-Speed Processor for Superior Performance Overview Application Examples

More information

Benchmarking Guide. Performance. BlackBerry Enterprise Server for Microsoft Exchange. Version: 5.0 Service Pack: 4

Benchmarking Guide. Performance. BlackBerry Enterprise Server for Microsoft Exchange. Version: 5.0 Service Pack: 4 BlackBerry Enterprise Server for Microsoft Exchange Version: 5.0 Service Pack: 4 Performance Benchmarking Guide Published: 2015-01-13 SWD-20150113132750479 Contents 1 BlackBerry Enterprise Server for Microsoft

More information

Hadoop: Embracing future hardware

Hadoop: Embracing future hardware Hadoop: Embracing future hardware Suresh Srinivas @suresh_m_s Page 1 About Me Architect & Founder at Hortonworks Long time Apache Hadoop committer and PMC member Designed and developed many key Hadoop

More information

Real-Time Operating Systems for MPSoCs

Real-Time Operating Systems for MPSoCs Real-Time Operating Systems for MPSoCs Hiroyuki Tomiyama Graduate School of Information Science Nagoya University http://member.acm.org/~hiroyuki MPSoC 2009 1 Contributors Hiroaki Takada Director and Professor

More information

MirrorBit Technology: The Foundation for Value-Added Flash Memory Solutions FLASH FORWARD

MirrorBit Technology: The Foundation for Value-Added Flash Memory Solutions FLASH FORWARD MirrorBit Technology: The Foundation for Value-Added Flash Memory Solutions FLASH FORWARD MirrorBit Technology: The Future of Flash Memory is Here Today Spansion is redefining the Flash memory industry

More information

Enterprise Flash Drive

Enterprise Flash Drive Enterprise Flash Drive A evolução do armazenamento de dados Tutorial sobre as tecnonologias SSD (Solid State Disk) e sua implantação pela engenharia da EMC, denominado EFD (Enterprise Flash Drive), onde

More information

Tipos de Máquinas. 1. Overview - 7946 x3550 M2. Features X550M2

Tipos de Máquinas. 1. Overview - 7946 x3550 M2. Features X550M2 Tipos de Máquinas X550M2 1. Overview - 7946 x3550 M2 IBM System x3550 M2 Machine Type 7946 is a follow on to the IBM System x3550 M/T 7978. The x3550 M2 is a self-contained, high performance, rack-optimized,

More information

Technical Note DDR2 Offers New Features and Functionality

Technical Note DDR2 Offers New Features and Functionality Technical Note DDR2 Offers New Features and Functionality TN-47-2 DDR2 Offers New Features/Functionality Introduction Introduction DDR2 SDRAM introduces features and functions that go beyond the DDR SDRAM

More information

Performance Workload Design

Performance Workload Design Performance Workload Design The goal of this paper is to show the basic principles involved in designing a workload for performance and scalability testing. We will understand how to achieve these principles

More information

Optics Technology Trends in Data Centers

Optics Technology Trends in Data Centers Optics Technology Trends in Data Centers Marc A. Taubenblatt IBM OFC Market Watch 2014 Outline/Summary Data movement/aggregate BW increasing and increasingly important Historically, per channel data rate

More information

Performance Evaluations of Graph Database using CUDA and OpenMP Compatible Libraries

Performance Evaluations of Graph Database using CUDA and OpenMP Compatible Libraries Performance Evaluations of Graph Database using CUDA and OpenMP Compatible Libraries Shin Morishima 1 and Hiroki Matsutani 1,2,3 1Keio University, 3 14 1 Hiyoshi, Kohoku ku, Yokohama, Japan 2National Institute

More information

Motivation: Smartphone Market

Motivation: Smartphone Market Motivation: Smartphone Market Smartphone Systems External Display Device Display Smartphone Systems Smartphone-like system Main Camera Front-facing Camera Central Processing Unit Device Display Graphics

More information

Computer Engineering: MS Program Overview, Fall 2013

Computer Engineering: MS Program Overview, Fall 2013 Computer Engineering: MS Program Overview, Fall 2013 Prof. Steven Nowick (nowick@cs.columbia.edu) Chair, (on sabbatical) Prof. Charles Zukowski (caz@columbia.edu) Acting Chair, Overview of Program The

More information

IOVU-571N ARM-based Panel PC

IOVU-571N ARM-based Panel PC IOVU-571N ARM-based Panel PC Features RISC-based Panel PC IOVU-57N Application Dimensions Ordering Information Specifications ARM-based Panel PC IOVU-571N Serial IOVU software support Packing List Options

More information

An Oracle White Paper August 2012. Oracle WebCenter Content 11gR1 Performance Testing Results

An Oracle White Paper August 2012. Oracle WebCenter Content 11gR1 Performance Testing Results An Oracle White Paper August 2012 Oracle WebCenter Content 11gR1 Performance Testing Results Introduction... 2 Oracle WebCenter Content Architecture... 2 High Volume Content & Imaging Application Characteristics...

More information

Fault Tolerance & Reliability CDA 5140. Chapter 3 RAID & Sample Commercial FT Systems

Fault Tolerance & Reliability CDA 5140. Chapter 3 RAID & Sample Commercial FT Systems Fault Tolerance & Reliability CDA 5140 Chapter 3 RAID & Sample Commercial FT Systems - basic concept in these, as with codes, is redundancy to allow system to continue operation even if some components

More information

iomemory Produkt Vorstellung

iomemory Produkt Vorstellung Storage Memory Platform iomemory VSL iosphere ioturbine directcache iomemory Produkt Vorstellung Jens Mertes Wolfgang Bresgen Solution Architect - CE Sales Manager +49 171 225 8 225 +49 160 97332249 jmertes@fusionio.com

More information

Exploring RAID Configurations

Exploring RAID Configurations Exploring RAID Configurations J. Ryan Fishel Florida State University August 6, 2008 Abstract To address the limits of today s slow mechanical disks, we explored a number of data layouts to improve RAID

More information

Microsoft Windows Server Hyper-V in a Flash

Microsoft Windows Server Hyper-V in a Flash Microsoft Windows Server Hyper-V in a Flash Combine Violin s enterprise-class storage arrays with the ease and flexibility of Windows Storage Server in an integrated solution to achieve higher density,

More information

SIMULATING NON-UNIFORM MEMORY ACCESS ARCHITECTURE

SIMULATING NON-UNIFORM MEMORY ACCESS ARCHITECTURE SIMULATING NON-UNIFORM MEMORY ACCESS ARCHITECTURE FOR CLOUD SERVER APPLICATIONS Joakim Nylund Master of Science Thesis Supervisor: Prof. Johan Lilius Advisor: Dr. Sébastien Lafond Embedded Systems Laboratory

More information

Sitecore Health. Christopher Wojciech. netzkern AG. christopher.wojciech@netzkern.de. Sitecore User Group Conference 2015

Sitecore Health. Christopher Wojciech. netzkern AG. christopher.wojciech@netzkern.de. Sitecore User Group Conference 2015 Sitecore Health Christopher Wojciech netzkern AG christopher.wojciech@netzkern.de Sitecore User Group Conference 2015 1 Hi, % Increase in Page Abondonment 40% 30% 20% 10% 0% 2 sec to 4 2 sec to 6 2 sec

More information

AN EVENT-BASED NETWORK-ON-CHIP MONITORING SERVICE

AN EVENT-BASED NETWORK-ON-CHIP MONITORING SERVICE AN EVENT-BASED NETWOK-ON-CH MOTOING SEVICE Calin Ciordas Twan Basten Andrei ădulescu Kees Goossens Jef van Meerbergen Eindhoven University of Technology, Eindhoven, The Netherlands hilips esearch Laboratories,

More information

SAN Conceptual and Design Basics

SAN Conceptual and Design Basics TECHNICAL NOTE VMware Infrastructure 3 SAN Conceptual and Design Basics VMware ESX Server can be used in conjunction with a SAN (storage area network), a specialized high speed network that connects computer

More information

Scalable Internet Services and Load Balancing

Scalable Internet Services and Load Balancing Scalable Services and Load Balancing Kai Shen Services brings ubiquitous connection based applications/services accessible to online users through Applications can be designed and launched quickly and

More information

HANIC 100G: Hardware accelerator for 100 Gbps network traffic monitoring

HANIC 100G: Hardware accelerator for 100 Gbps network traffic monitoring CESNET Technical Report 2/2014 HANIC 100G: Hardware accelerator for 100 Gbps network traffic monitoring VIKTOR PUš, LUKÁš KEKELY, MARTIN ŠPINLER, VÁCLAV HUMMEL, JAN PALIČKA Received 3. 10. 2014 Abstract

More information

Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database

Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database Built up on Cisco s big data common platform architecture (CPA), a

More information

Multi-core Curriculum Development at Georgia Tech: Experience and Future Steps

Multi-core Curriculum Development at Georgia Tech: Experience and Future Steps Multi-core Curriculum Development at Georgia Tech: Experience and Future Steps Ada Gavrilovska, Hsien-Hsin-Lee, Karsten Schwan, Sudha Yalamanchili, Matt Wolf CERCS Georgia Institute of Technology Background

More information