Design of an In-Memory Database Engine Using Intel Xeon Phi Coprocessors
|
|
|
- Antony Parks
- 10 years ago
- Views:
Transcription
1 Design of an In-Memory Database Engine Using Intel Xeon Phi Coprocessors Michael Scherger Department of Computer Science exas Christian University ort Worth, X, USA Abstract (PDPA 14) his research presents the design and initial implementation of a database engine using an Intel Xeon Phi co-processor. he many integrated cores (MIC) of the Xeon Phi make this hardware accelerator a natural computing platform for an in-memory database engine or server. he database tables reside in the memory space of the MIC thus supporting fast in-memory database applications. his achieved by developing a coalescing parallel memory manager to allocate parallel variables in the same manner that fields are created in a table using a SQL CREAE ABLE command. he SQL interface was created using a database driver toolkit that provides an interface to the Xeon Phi server and client application. Once the basic framework was established, the algorithms for SQL select, insert, update, delete, and join were created to manipulate database information in the memory of the Xeon Phi. Keywords: parallel databases, parallel hardware accelerators, special purpose architectures 1 Introduction Massively data parallel computers and the SIMD model of parallel computation can be a natural model of parallel computation to consider for massively parallel database servers. Since the cores are extremely close to the parallel memory, fast parallel memory searching make it a natural platform for data parallel computing intensive applications. As described in Potter in [9] and [1], data parallel models of computation such as the SIMD, ASC, or SIDAC model conform to the concept of a parallel database server since the data can logically and physically partitioned similar to the data organization of a database table or spreadsheet [2][8][9][1][11] and [12]. his research paper discusses the initial design of an in-memory database server using a Intel Xeon Phi coprocessors. his research will discuss the design considerations and challenges for a database server and SQL engine that interfaces with the memory of a hardware accelerated data parallel computer. his system design can promote the use of massively parallel computers as database servers for use in embedded database systems, real-time database systems, and fast parallel associative search engines. Database management systems (DBMS) provide a structured mechanism for storing, organizing, and retrieving data in a way that is consistent with the database s format [14]. System software will allow data storage and access to a database without the user s knowledge about the internal data representation either in persistent storage or in the computer s memory. A DBMS usually has but is not limited to the following components [14]: Processors and main memory the hardware of the DBMS for data selection and computation Secondary storage disks for data persistence and offline storage Database manager software for creating and maintaining databases, tables, fields, and relations Utilities software for database maintenance, data integrity and security, and database repair Application development tools software for database application development integrated into the DBMS Report writers software modules for presentations and reports based on tables and queries from database information Design aids software to assist in the design of databases, tables, fields, indexes, and relationships he organization of this research paper is as follows. Section 2 will use the tracking and correlation problem in air traffic control as a motivating example. Section 3 present an overview of the Intel Xeon Phi coprocessor and system software. Section 4 will present the hardware and physical design of the database server including the mapping of table records into the memory of the parallel computer. Section 5 will discuss the techniques of sequential and parallel database query processing. Section 6 will present the system software design of the parallel SQL engine and the algorithms for
2 the basic database server operations. Section 7 will discuss the conclusions and future research. 2 Example Application: AC racking and Correlation Consider a real world and real time application of air traffic control. he following example is an extremely simplified version of the air-traffic tracking and control problem, but provides enough detail to illustrate the system software design for a parallel database server and SQL interface [11]. Some of the basic tasks of air traffic control are: racking and correlation he radar will generate reports of flights of returns that must be correlated to tracks of flights currently in the system. Conflict detection he computer system must then determine if there are any tracks/flights that will conflict/collide with a time look ahead of a predetermined number of minutes or miles. light plan update Based on the tracking and correlation information and combining it with the conflict detection, the flight plan information will be updated. here are many other important tasks in air traffic control, but this is an example of a real-time processing problem [6],[7], and [8]. here are also hard deadlines for computation imposed on the above tasks. hey must be completed prior to the next hard deadline in this real time system. he flight plans and tracks from radar can be stored in a simplified tabular format (flat table) in a database table similar to illustration in igure 1. Insert a new flight into the table. As aircraft enter the airspace, they need to be stored into the flight table. his involves searching for an open/free record in the table and then copying the flight information into that newly created record. Deleting a flight from the table. As aircraft leave the airspace, they need to be deleted from the flight table. his involves searching for the record of the flight to delete and marking that the record is inactive. Selecting a flight from the table. Selection involves identifying one or more flights for further processing. he selection must scan the data in the fields for this table and then return that information back for further processing. Updating the flight information. Updating a flight begins with a search followed by a copy of new information into the selected record from the track information. Each of these frequent operations (insert, select, update, delete) requires some type of a parallel memory search. In the case of insert, the search operation is for an open record in the table. In the case of the select, update, and delete operations, the search required is based on the data stored in the records of the table. his is a contextual search or associative based search. he most common method to improve search performance in a database server is to use index tables. his is illustrated in the igure 2. IDX-1 IDX-2 IDX-3 AID CO56 AA123 UA722 LA AA LON AL HDG AS AID LA LON AL HDG AS CO56 AA CA UA722 AA igure 2: light table illustrating the use of index tables to improve performance. CA igure 1: Sample database to store flight plan information. he AC system software will/may have to perform the following operations when receiving a new set of track information. An index table stores the record indexes based on some ordering criteria or sorting functions. In igure 2, an index table may store pointers to the indices for the aircraft sorted by aircraft id. Another index may store pointers to indices for the aircraft based on altitude. inally another index may store pointers to the indices for the aircraft based on airspeed. In theory, a database table can have one or more indices for each field. However, this dramatically reduces the performance of the insert, update, and delete operations at the benefit of doing fast searches [1][3][4][5]. As new records (flights) are entered into the
3 table, the index tables need to be updated and maintain their sorted order. he constant resorting of each index table becomes increasingly computational demanding. he same is true for the delete and update operations when the flight information changes. he performance degradation is further amplified when multiple index tables must change. 3 Overview of the Intel Xeon Phi he Intel Xeon Phi co-processors have 6 in-order Intel MIC architecture cores running at 1 GHz. he Intel MIC architecture is based on the x86 ISA, extended with 64-bit addressing and 512-bit wide SIMD vector instructions and registers. Each core supports 4 hardware threads. In addition to the cores, there are multiple on-die memory controllers and other components. As shown in igure 3, each core has a newly designed Vector Processing Unit (VPU). Each VPU unit contains bit vector registers. o support the new vector processing model, a new 512-bit SIMD ISA was introduced. he VPU is a key feature of the Intel MIC architecture based cores. ully utilizing the vector unit is critical the best performance. he Intel MIC architecture cores do not support other SIMD ISA s such as MMX, SSE, or AVX. igure 4: Logical MIC core layout and ring communication bus. 4 Database Engine Hardware Design and Architecture here are a few assumptions regarding the design of the parallel database server [11]. 1. he database, tables, and records in the parallel database server are memory resident. Storage is completely volatile and there is no persistent storage in the cells or array memory implemented in this design. or real-time computation, storing data and record information in in secondary storage is costly in terms of access time. Having the data reside in memory, close to the processing elements is more conducive for real-time applications. 2. he data parallel memory map is similar to the field layout planned of a database table. If ABLE_A has fields 1, 2, 3 created in that order, then the parallel memory map will have parallel variables 1 [$], 2 [$], and 3 [$] located in lower to higher parallel memory addresses. igure 3: Intel Xeon Phi MIC core block diagram. Each core has a 32KB L1 data cache, a 32KB L1 instruction cache, and a 512KB L2 cache. As shown in igure 4, he L2 caches of all cores are interconnected with each other and the memory controllers via a bidirectional ring bus, that effectively creates a shared last-level cache of up to 32 MB. he design of each core includes a short in-order pipeline. here is no latency in executing scalar operations and low latency in executing vector operations. Since the in-order pipeline is short, the overhead for branch misprediction is low. 3. he number of actual processing elements is fixed during the execution of the parallel database server. his is not an unrealistic assumption since the Intel Xeon Phi has a fixed number of cores (or hyperthread processors). 4. he amount of memory per processing element is fixed. Again, the memory in the Intel Xeon Phi separate parallel memory space than the memory of the host computer. Albeit the parallel memory space is often smaller than the host memory, for most database applications, the amount of parallel storage is adequate. Since this model is using massively parallel search and responder processing as a model of data parallelism,
4 database index tables are no longer required. Each database field (column) can be searched for the desired value in constant time. Data parallelism can also support efficient software for associative searches. he cores, or processing elements () of the Intel Xeon Phi will be used to assist in the basic database operations and searching. his is illustrated in igure 5. In this figure, the database table is superimposed on the memory and processing elements of a SIMD computer. wo additional fields have been prefixed to the table: a busy-idle flag to indicate if the or record is active and a responder flag used for search operations. Using this approach, each individual record is located in the memory of a. Using massive parallel searching, processing elements can scan their individual memories and set the responder flag or turn their busy-idle flag on or off. able A 1 able B 1 able B 2 ree Parallel Memory igure 6: Multiple database tables, table folds, and unused parallel memory.folds, and unused parallel memory. Busy / Idle R AID CO56 AA123 UA722 AA1223 LA LON AL HDG AS By necessity, data memory management becomes the responsibility of the parallel database server instead of the parallel compiler or other system software [11]. A coalescing parallel memory manager (CPMM) was developed to keep track of table, field and fold addresses. igure 7 illustrates some of the administrative data structures that must be maintained for folded tables. CO igure 5: light table superimposed onto the s and memory of a SIMD computer. Database tables are dynamic objects; there is typically no a priori knowledge of the number of table records. If the number of records in a table exceeds the number of physical s in the system (parallel memory overflow) the database server will use a cyclical data placement strategy when inserting new records. his is a form of virtual parallelism that is maintained by the parallel database server and not the operating system. his cyclical placement will manage multiple tables with multiple folds in an interleaved fashion as determined by the amount of data in the tables. or example, in igure 6, able A utilizes only 4 s, while the number of records in able B has exceeded the number of s resulting in multiple folds. An insert operation for able A will add a new record into the area occupied by fold 1 for table A. or table B, the next insert operation will be in fold 2. If enough records are added to able A to exceed the capacity of fold 1, a new fold will be created in the free memory space to the right of fold 2 of able B [11]. igure 7: Data structures for logical database tables, folds, records, and fields. he parallel database server will maintain the controlling data structures to manage the database, tables, folds, records, and column addresses. hese data structures reside in the sequential memory of the control unit or front-end computer. Dark shaded regions in this figure represent active records in the table. Note that there are two folds for this table and the field list is replicated for each fold. he table fold is an absolute parallel memory address while the field address is a relative parallel memory address. By adding the two memory addresses together, the physical memory address for a database field
5 within a fold can be determined. he parallel memory manager also created extra hidden table fields used for basic database operations (described in a later section). hese hidden table fields included several responder bits, a busy/idle flag, and timestamp fields for record insertion, selection, and update. 5 Sequential and Parallel Query Processing Query processing refers to the range of activities involved with extracting data from a database. he activities include translation of queries in high-level database languages into expressions that can be used at the physical (storage) level. he fundamental steps a database server must perform when processing a database query appear in igure 8: Query Parser and ranslator Relational Algebra Expression Optimizer 5.1 Sequential Query Processing Algorithms here are several sequential query processing algorithms defined in the literature [14] and [15]. Each algorithm has a particular use when the query processing evaluation takes place. he most relevant query processing algorithm related to this research is the A1 Linear Search algorithm, which is now described. In a linear search, the system scans each file block and tests all records to see whether they satisfy the selection condition. An initial seek is required to access the first block of the file. he cost of linear search, in terms of number of disk operations, is one seek plus b r block transfers, where b r denotes the number of blocks in the file. Equivalently, the time cost is t S + b r * t. Although the A1 Linear Search algorithm may be slower on sequential computers than other algorithms for implementing selection and other query processing tasks, it is the most natural algorithm in terms of conversion to a massively parallel equivalent since the linear search on a parallel variable can be accomplished in constant time on SIMD (or MASC) computers assuming the database can be held entirely in memory. Query Output Evaluation Engine Execution Plan 6 System Software Design and Architecture Data igure 8: Major functional components of database query processing. Before query processing can begin, the system must translate the query into a usable form. A language such as SQL is appropriate for software application development, but is not amenable to be the system s internal representation of a query. As shown in igure 3, the first step the system must take in query processing is to translate a given query into its internal form. his translation process is similar to the work performed by the parser of a compiler. In generating the internal form of the query, the parser checks the syntax of the user s query and verifies that the query names appear in the database. he system then constructs a parse tree representation of the query, which it then translates into a relational algebra expression. he sequence of steps in query processing is representative. Not all databases exactly follow these steps. However, the concepts that have been described form the basis of query processing in databases. Now that the basic parallel memory management issues have been addressed, the system software design of the database server is described. A client application will use the database driver manager to interface with the client database driver. he client database driver communicates with the SQL Engine. he SQL Engine will call process these instructions and then call the appropriate parallel database server, where there will be a corresponding function call to perform an operation in the memory of the parallel computer. he parallel database server will then receive the request from the database driver and control the databases, tables, records, and columns in the parallel memory. 6.1 Parallel SQL Insert Algorithm he task of the parallel SQL insert operation is to insert new data into a free record located anywhere in the table in any fold. An example of the SQL INSER statement is the following: INSER INO LIGHS( AID, LA, LON, AL, AS ) VALUES( CO128, 43.39, 83.67, 19, 45 ) his insert statement will insert a new record into the LIGHS table (reference the database table in igure 4)
6 and assign the respective values to the AID, LA, LON, AL, and AS fields. or inserting a record into a parallel memory space, the basic parallel insert algorithm is the following: Algorithm Par_SQL_Insert( RecordData ) open_record_found = ALSE or each table fold Perform associative search on the table s BI field where BI field is false (i.e. record is empty there may be multiple records returned) if ( idle records found ) select one record; BI = RUE open_record_found = RUE break // no open record is found // in any fold if ( open_record_found == ALSE ) create a new fold select first record in the new fold BI = RUE break Copy the data into the parallel memory record Return success or failure igure 8: Parallel SQL insert algorithm. he algorithm igure 8 begins by searching for an open or idle record in each of the table folds in turn. If idle records are found, then identification number and the fold select one record and field addresses are used to copy the data into the parallel memory record. If no idle record is found, then the parallel memory manager must create a new fold. his can be accomplished by allocating space from the unused space in parallel memory the same width as previous folds and recording the new base address in sequential memory. Since a new fold is created, the parallel memory manager can select any for the insertion; e.g. the first (lowest id number) can be used. he basic parallel search can be done in O(1) time. However, since each table fold may have to be scanned, the running time is O(#folds) which is typically small and normally still O(1) since the number of folds normally constant and not a function of higher complexity. 6.2 Parallel SQL Delete Algorithm he task of the parallel SQL delete operation is to delete records according to some searching or selection criteria. An example of the SQL DELEE statement is the following: DELEE ROM LIGHS WHERE AID = NW 545 /* delete criteria */ his delete statement will delete all records where the AID (aircraft ID) is NW 545. or deleting a record from the parallel memory space, the parallel delete algorithm is the following: Algorithm Parallel_SQL_Delete( DeleteCriteria ) returns Boolean or each table fold Perform associative search where the Delete Criteria is RUE and set responders appropriately If (the responder is RUE) Reset the Busy-Idle flag If all records in the current fold are idle CPMM marks the current fold as free Return success or failure igure 9: Parallel SQL delete algorithm. he algorithm in igure 9 begins by looping through each table fold and having each cell evaluate the appropriate fields as specified in the delete criteria clause. or those cells where the delete criteria clause is rue, the responder busy-idle flag is reset. If all the records in a given fold have their busy-idle flag reset, the coalescing parallel memory manager (CPMM) marks that fold as completely unused and returns it to the free pool of parallel memory. he basic parallel delete can be done in O(1) time. However, since each table fold will have to be scanned, the running time is O(#folds). 7 Conclusions and uture Work his research paper has presented an initial design of an in-memory database engine utilizing Intel Xeon Phi co-processors. Also presented was the system software design and interface for sequential programs and applications to interface with the server. his was achieved by designing and developing the set of algorithms for common database operations that would support the functionality of a parallel database server. he SQL operations presented include insert and delete and execute in O(#folds) steps. he update and selection operations are similar. he design of a coalescing parallel memory manager was also developed to manage large tables and virtual parallelism. An area of future research explores how the parallel database handles virtual parallelism. he present design uses a coalescing parallel memory manager to control the table folds in the memory of the parallel computer. his parallel memory manager is a built in component of the parallel database server because the Xeon Phi environment assumed that the number of processing elements was fixed at runtime and could not change.
7 Another area of future research could explore how the tables, records, and fields are physically mapped to the memory of the parallel computer. Presently, the parallel memory map as shown in igures 6 and 7 indicate that the parallel variables are allocated for all processing elements in a given fold regardless of the number of records actually storing information. his leads to a waste of processing elements for tables with only a few records. 8 References [1] Babb, E, Implementing a Relational Database by Means of Specialized Hardware, ACM ransactions on Database Systems, Vol., 4, No. 1, 1979, pp [2] Batcher, Kenneth, he SARAN Series E, Proceedings of the International Conference on Parallel Processing, 1977, pp [3] Berra, P. Bruce, Some Problems in Associative Processor Applications to Database Management, National Computer Conference and Exposition, Vol. 43, 1974, pp [4] Deiore, Casper R, P. Bruce Berra, A Quantitative Analysis of the Utilization of Associative Memories in Data Management, IEEE ransactions on Computers, Vol. c-23, No. 2, ebruary, 1974, pp [5] Hsiao, D., and M. J. Menon, Design and Analysis of a Multi-Backend Database System for Performance Improvement, unctionality Expansion and Growth (Part 1), echnical Report, OSU-CISRC-R-81-7, he Ohio State University, Columbus, Ohio, July, [6] Jin, Mingxian, Johnnie Baker, and Kenneth Batcher, "imings for Associative Operations on the MASC Model", Proc. of the 15th International Parallel and Distributed Processing Symposium (Workshop in Massively Parallel Processing), abstract on page 193, full text on CDROM, April 21. [7] Lockheed Martin Compary - formerly Loral Defense Systems, ASPRO-VME Parallel/Associative Computer: echnical Overview, Oct [8] Meilander, Will, Johnnie Baker, and Mingxian Jin, "Importance of SIMD Computation Reconsidered", Proc. of the 17th International Parallel and Distributed Processing Symposium (Workshop on Massively Parallel Processing), abstract on page 266, full text on CDROM, April 23. [9] Potter, Jerry L., Associative Computing: A Programming Paradigm for Massively Parallel Computers, Plennum Press, New York, NY, [1] Potter, Jerry, Johnnie Baker, Stephen Scott, Arvind Bansal, Chokchai Leangsuksun, and Chandra Asthagiri, ASC: An Associative Computing Paradigm, Computer, Nov. 1994, pp [11] Scherger, Michael, An Object Model ramework, Runtime Environment Support, and Database System Software for a Multiple Instruction Stream Associative Model of Parallel Computation, PhD Dissertation, Department of Computer Science, Kent State University, Kent, Ohio 25. [12] Michael Scherger, Johnnie Baker, and Jerry Potter, "Multiple Instruction Stream Control for an Associative Model of Parallel Computation", Proc. of the 16th International Parallel and Distributed Processing Symposium, abstract on page 266, full text on CDROM, April 23. [13] Scherger, Michael, Johnnie Baker, and Jerry Potter, "An Object Oriented ramework for and Associative Model of Parallel Computation", Proc. of the 16th International Parallel and Distributed Processing Symposium, abstract on page 166, full text on CDROM, April 23. [14] Scherger, Michael, On the Design of a Massively Parallel Database Server for In-Memory and Real ime Database Applications, Proceedings of the Int l Conf. on Parallel and Distributed Processing echniques (PDPA), Las Vegas, June, 27. [15] Silberschatz, Abraham, Henry. Korth, and S. Sudarshan, Database System Concepts, 4 th ed., McGraw-Hill, Boston, MA, 22. [16] Su, Stanley Y. W., Database Computers: Principles Architectures and echnologies, McGraw-Hill, New York, NY, [17] Intel Corporation, Intel Xeon Phi Coprocessor Developer s Quick Start Guide, Version 1.7, 213. [18] Intel Corporation, Intel Xeon Phi Coprocessor Architecture, 213. [19] Intel Corporation, Intel Xeon Phi Coprocessor System Software Developer s Guide, 213.
Rethinking SIMD Vectorization for In-Memory Databases
SIGMOD 215, Melbourne, Victoria, Australia Rethinking SIMD Vectorization for In-Memory Databases Orestis Polychroniou Columbia University Arun Raghavan Oracle Labs Kenneth A. Ross Columbia University Latest
Binary search tree with SIMD bandwidth optimization using SSE
Binary search tree with SIMD bandwidth optimization using SSE Bowen Zhang, Xinwei Li 1.ABSTRACT In-memory tree structured index search is a fundamental database operation. Modern processors provide tremendous
More on Pipelining and Pipelines in Real Machines CS 333 Fall 2006 Main Ideas Data Hazards RAW WAR WAW More pipeline stall reduction techniques Branch prediction» static» dynamic bimodal branch prediction
Query Optimization in Teradata Warehouse
Paper Query Optimization in Teradata Warehouse Agnieszka Gosk Abstract The time necessary for data processing is becoming shorter and shorter nowadays. This thesis presents a definition of the active data
Elemental functions: Writing data-parallel code in C/C++ using Intel Cilk Plus
Elemental functions: Writing data-parallel code in C/C++ using Intel Cilk Plus A simple C/C++ language extension construct for data parallel operations Robert Geva [email protected] Introduction Intel
Parallel Computing. Benson Muite. [email protected] http://math.ut.ee/ benson. https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage
Parallel Computing Benson Muite [email protected] http://math.ut.ee/ benson https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage 3 November 2014 Hadoop, Review Hadoop Hadoop History Hadoop Framework
Solving a 2D Knapsack Problem Using a Hybrid Data-Parallel/Control Style of Computing
Solving a D Knapsack Problem Using a Hybrid Data-Parallel/Control Style of Computing Darrell R. Ulm Johnnie W. Baker Michael C. Scherger Department of Computer Science Department of Computer Science Department
SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications. Jürgen Primsch, SAP AG July 2011
SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications Jürgen Primsch, SAP AG July 2011 Why In-Memory? Information at the Speed of Thought Imagine access to business data,
ACCELERATING SELECT WHERE AND SELECT JOIN QUERIES ON A GPU
Computer Science 14 (2) 2013 http://dx.doi.org/10.7494/csci.2013.14.2.243 Marcin Pietroń Pawe l Russek Kazimierz Wiatr ACCELERATING SELECT WHERE AND SELECT JOIN QUERIES ON A GPU Abstract This paper presents
Hardware/Software Co-Design of a Java Virtual Machine
Hardware/Software Co-Design of a Java Virtual Machine Kenneth B. Kent University of Victoria Dept. of Computer Science Victoria, British Columbia, Canada [email protected] Micaela Serra University of Victoria
Enterprise Applications
Enterprise Applications Chi Ho Yue Sorav Bansal Shivnath Babu Amin Firoozshahian EE392C Emerging Applications Study Spring 2003 Functionality Online Transaction Processing (OLTP) Users/apps interacting
Facebook: Cassandra. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation
Facebook: Cassandra Smruti R. Sarangi Department of Computer Science Indian Institute of Technology New Delhi, India Smruti R. Sarangi Leader Election 1/24 Outline 1 2 3 Smruti R. Sarangi Leader Election
6. Storage and File Structures
ECS-165A WQ 11 110 6. Storage and File Structures Goals Understand the basic concepts underlying different storage media, buffer management, files structures, and organization of records in files. Contents
Pattern Insight Clone Detection
Pattern Insight Clone Detection TM The fastest, most effective way to discover all similar code segments What is Clone Detection? Pattern Insight Clone Detection is a powerful pattern discovery technology
IBM CELL CELL INTRODUCTION. Project made by: Origgi Alessandro matr. 682197 Teruzzi Roberto matr. 682552 IBM CELL. Politecnico di Milano Como Campus
Project made by: Origgi Alessandro matr. 682197 Teruzzi Roberto matr. 682552 CELL INTRODUCTION 2 1 CELL SYNERGY Cell is not a collection of different processors, but a synergistic whole Operation paradigms,
Physical Data Organization
Physical Data Organization Database design using logical model of the database - appropriate level for users to focus on - user independence from implementation details Performance - other major factor
Assessing the Performance of OpenMP Programs on the Intel Xeon Phi
Assessing the Performance of OpenMP Programs on the Intel Xeon Phi Dirk Schmidl, Tim Cramer, Sandra Wienke, Christian Terboven, and Matthias S. Müller [email protected] Rechen- und Kommunikationszentrum
AQA GCSE in Computer Science Computer Science Microsoft IT Academy Mapping
AQA GCSE in Computer Science Computer Science Microsoft IT Academy Mapping 3.1.1 Constants, variables and data types Understand what is mean by terms data and information Be able to describe the difference
CHAPTER 17: File Management
CHAPTER 17: File Management The Architecture of Computer Hardware, Systems Software & Networking: An Information Technology Approach 4th Edition, Irv Englander John Wiley and Sons 2010 PowerPoint slides
Computer Architecture TDTS10
why parallelism? Performance gain from increasing clock frequency is no longer an option. Outline Computer Architecture TDTS10 Superscalar Processors Very Long Instruction Word Processors Parallel computers
CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen
CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen LECTURE 14: DATA STORAGE AND REPRESENTATION Data Storage Memory Hierarchy Disks Fields, Records, Blocks Variable-length
Database 2 Lecture I. Alessandro Artale
Free University of Bolzano Database 2. Lecture I, 2003/2004 A.Artale (1) Database 2 Lecture I Alessandro Artale Faculty of Computer Science Free University of Bolzano Room: 221 [email protected] http://www.inf.unibz.it/
Chapter 1 Computer System Overview
Operating Systems: Internals and Design Principles Chapter 1 Computer System Overview Eighth Edition By William Stallings Operating System Exploits the hardware resources of one or more processors Provides
Unit 4.3 - Storage Structures 1. Storage Structures. Unit 4.3
Storage Structures Unit 4.3 Unit 4.3 - Storage Structures 1 The Physical Store Storage Capacity Medium Transfer Rate Seek Time Main Memory 800 MB/s 500 MB Instant Hard Drive 10 MB/s 120 GB 10 ms CD-ROM
www.quilogic.com SQL/XML-IMDBg GPU boosted In-Memory Database for ultra fast data management Harald Frick CEO QuiLogic In-Memory DB Technology
SQL/XML-IMDBg GPU boosted In-Memory Database for ultra fast data management Harald Frick CEO QuiLogic In-Memory DB Technology The parallel revolution Future computing systems are parallel, but Programmers
In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller
In-Memory Databases Algorithms and Data Structures on Modern Hardware Martin Faust David Schwalb Jens Krüger Jürgen Müller The Free Lunch Is Over 2 Number of transistors per CPU increases Clock frequency
Database System Architecture & System Catalog Instructor: Mourad Benchikh Text Books: Elmasri & Navathe Chap. 17 Silberschatz & Korth Chap.
Database System Architecture & System Catalog Instructor: Mourad Benchikh Text Books: Elmasri & Navathe Chap. 17 Silberschatz & Korth Chap. 1 Oracle9i Documentation First-Semester 1427-1428 Definitions
2) What is the structure of an organization? Explain how IT support at different organizational levels.
(PGDIT 01) Paper - I : BASICS OF INFORMATION TECHNOLOGY 1) What is an information technology? Why you need to know about IT. 2) What is the structure of an organization? Explain how IT support at different
An Implementation Of Multiprocessor Linux
An Implementation Of Multiprocessor Linux This document describes the implementation of a simple SMP Linux kernel extension and how to use this to develop SMP Linux kernels for architectures other than
Scalability and Classifications
Scalability and Classifications 1 Types of Parallel Computers MIMD and SIMD classifications shared and distributed memory multicomputers distributed shared memory computers 2 Network Topologies static
Applications to Computational Financial and GPU Computing. May 16th. Dr. Daniel Egloff +41 44 520 01 17 +41 79 430 03 61
F# Applications to Computational Financial and GPU Computing May 16th Dr. Daniel Egloff +41 44 520 01 17 +41 79 430 03 61 Today! Why care about F#? Just another fashion?! Three success stories! How Alea.cuBase
In-memory Tables Technology overview and solutions
In-memory Tables Technology overview and solutions My mainframe is my business. My business relies on MIPS. Verna Bartlett Head of Marketing Gary Weinhold Systems Analyst Agenda Introduction to in-memory
Storage Management for Files of Dynamic Records
Storage Management for Files of Dynamic Records Justin Zobel Department of Computer Science, RMIT, GPO Box 2476V, Melbourne 3001, Australia. [email protected] Alistair Moffat Department of Computer Science
An Eclipse Plug-In for Visualizing Java Code Dependencies on Relational Databases
An Eclipse Plug-In for Visualizing Java Code Dependencies on Relational Databases Paul L. Bergstein, Priyanka Gariba, Vaibhavi Pisolkar, and Sheetal Subbanwad Dept. of Computer and Information Science,
Accelerating Business Intelligence with Large-Scale System Memory
Accelerating Business Intelligence with Large-Scale System Memory A Proof of Concept by Intel, Samsung, and SAP Executive Summary Real-time business intelligence (BI) plays a vital role in driving competitiveness
Oracle BI EE Implementation on Netezza. Prepared by SureShot Strategies, Inc.
Oracle BI EE Implementation on Netezza Prepared by SureShot Strategies, Inc. The goal of this paper is to give an insight to Netezza architecture and implementation experience to strategize Oracle BI EE
Mimer SQL Real-Time Edition White Paper
Mimer SQL Real-Time Edition - White Paper 1(5) Mimer SQL Real-Time Edition White Paper - Dag Nyström, Product Manager Mimer SQL Real-Time Edition Mimer SQL Real-Time Edition is a predictable, scalable
Chapter 13. Chapter Outline. Disk Storage, Basic File Structures, and Hashing
Chapter 13 Disk Storage, Basic File Structures, and Hashing Copyright 2007 Ramez Elmasri and Shamkant B. Navathe Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files
File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System
CS341: Operating System Lect 36: 1 st Nov 2014 Dr. A. Sahu Dept of Comp. Sc. & Engg. Indian Institute of Technology Guwahati File System & Device Drive Mass Storage Disk Structure Disk Arm Scheduling RAID
Architectures for Big Data Analytics A database perspective
Architectures for Big Data Analytics A database perspective Fernando Velez Director of Product Management Enterprise Information Management, SAP June 2013 Outline Big Data Analytics Requirements Spectrum
A Case Study - Scaling Legacy Code on Next Generation Platforms
Available online at www.sciencedirect.com ScienceDirect Procedia Engineering 00 (2015) 000 000 www.elsevier.com/locate/procedia 24th International Meshing Roundtable (IMR24) A Case Study - Scaling Legacy
Reconfigurable Architecture Requirements for Co-Designed Virtual Machines
Reconfigurable Architecture Requirements for Co-Designed Virtual Machines Kenneth B. Kent University of New Brunswick Faculty of Computer Science Fredericton, New Brunswick, Canada [email protected] Micaela Serra
Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices
Proc. of Int. Conf. on Advances in Computer Science, AETACS Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices Ms.Archana G.Narawade a, Mrs.Vaishali Kolhe b a PG student, D.Y.Patil
Evaluation of Sub Query Performance in SQL Server
EPJ Web of Conferences 68, 033 (2014 DOI: 10.1051/ epjconf/ 201468033 C Owned by the authors, published by EDP Sciences, 2014 Evaluation of Sub Query Performance in SQL Server Tanty Oktavia 1, Surya Sujarwo
Attaining EDF Task Scheduling with O(1) Time Complexity
Attaining EDF Task Scheduling with O(1) Time Complexity Verber Domen University of Maribor, Faculty of Electrical Engineering and Computer Sciences, Maribor, Slovenia (e-mail: [email protected]) Abstract:
Hypertable Architecture Overview
WHITE PAPER - MARCH 2012 Hypertable Architecture Overview Hypertable is an open source, scalable NoSQL database modeled after Bigtable, Google s proprietary scalable database. It is written in C++ for
White Paper. Optimizing the Performance Of MySQL Cluster
White Paper Optimizing the Performance Of MySQL Cluster Table of Contents Introduction and Background Information... 2 Optimal Applications for MySQL Cluster... 3 Identifying the Performance Issues.....
Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase
Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform
SGI. High Throughput Computing (HTC) Wrapper Program for Bioinformatics on SGI ICE and SGI UV Systems. January, 2012. Abstract. Haruna Cofer*, PhD
White Paper SGI High Throughput Computing (HTC) Wrapper Program for Bioinformatics on SGI ICE and SGI UV Systems Haruna Cofer*, PhD January, 2012 Abstract The SGI High Throughput Computing (HTC) Wrapper
Benchmarking Cassandra on Violin
Technical White Paper Report Technical Report Benchmarking Cassandra on Violin Accelerating Cassandra Performance and Reducing Read Latency With Violin Memory Flash-based Storage Arrays Version 1.0 Abstract
Chapter 13. Disk Storage, Basic File Structures, and Hashing
Chapter 13 Disk Storage, Basic File Structures, and Hashing Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files Hashed Files Dynamic and Extendible Hashing
Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi
Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi ICPP 6 th International Workshop on Parallel Programming Models and Systems Software for High-End Computing October 1, 2013 Lyon, France
File System Management
Lecture 7: Storage Management File System Management Contents Non volatile memory Tape, HDD, SSD Files & File System Interface Directories & their Organization File System Implementation Disk Space Allocation
Oracle Database In-Memory The Next Big Thing
Oracle Database In-Memory The Next Big Thing Maria Colgan Master Product Manager #DBIM12c Why is Oracle do this Oracle Database In-Memory Goals Real Time Analytics Accelerate Mixed Workload OLTP No Changes
VirtualCenter Database Performance for Microsoft SQL Server 2005 VirtualCenter 2.5
Performance Study VirtualCenter Database Performance for Microsoft SQL Server 2005 VirtualCenter 2.5 VMware VirtualCenter uses a database to store metadata on the state of a VMware Infrastructure environment.
Database Fundamentals
Database Fundamentals Computer Science 105 Boston University David G. Sullivan, Ph.D. Bit = 0 or 1 Measuring Data: Bits and Bytes One byte is 8 bits. example: 01101100 Other common units: name approximate
SAP HANA PLATFORM Top Ten Questions for Choosing In-Memory Databases. Start Here
PLATFORM Top Ten Questions for Choosing In-Memory Databases Start Here PLATFORM Top Ten Questions for Choosing In-Memory Databases. Are my applications accelerated without manual intervention and tuning?.
Infrastructure Matters: POWER8 vs. Xeon x86
Advisory Infrastructure Matters: POWER8 vs. Xeon x86 Executive Summary This report compares IBM s new POWER8-based scale-out Power System to Intel E5 v2 x86- based scale-out systems. A follow-on report
Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 13-1
Slide 13-1 Chapter 13 Disk Storage, Basic File Structures, and Hashing Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files Hashed Files Dynamic and Extendible
COMPUTER ORGANIZATION ARCHITECTURES FOR EMBEDDED COMPUTING
COMPUTER ORGANIZATION ARCHITECTURES FOR EMBEDDED COMPUTING 2013/2014 1 st Semester Sample Exam January 2014 Duration: 2h00 - No extra material allowed. This includes notes, scratch paper, calculator, etc.
Cassandra A Decentralized, Structured Storage System
Cassandra A Decentralized, Structured Storage System Avinash Lakshman and Prashant Malik Facebook Published: April 2010, Volume 44, Issue 2 Communications of the ACM http://dl.acm.org/citation.cfm?id=1773922
Performance Tuning for the Teradata Database
Performance Tuning for the Teradata Database Matthew W Froemsdorf Teradata Partner Engineering and Technical Consulting - i - Document Changes Rev. Date Section Comment 1.0 2010-10-26 All Initial document
Seeking Fast, Durable Data Management: A Database System and Persistent Storage Benchmark
Seeking Fast, Durable Data Management: A Database System and Persistent Storage Benchmark In-memory database systems (IMDSs) eliminate much of the performance latency associated with traditional on-disk
Chapter 13 Disk Storage, Basic File Structures, and Hashing.
Chapter 13 Disk Storage, Basic File Structures, and Hashing. Copyright 2004 Pearson Education, Inc. Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files
Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database
Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database Built up on Cisco s big data common platform architecture (CPA), a
SAP HANA SAP s In-Memory Database. Dr. Martin Kittel, SAP HANA Development January 16, 2013
SAP HANA SAP s In-Memory Database Dr. Martin Kittel, SAP HANA Development January 16, 2013 Disclaimer This presentation outlines our general product direction and should not be relied on in making a purchase
How To Store Data On An Ocora Nosql Database On A Flash Memory Device On A Microsoft Flash Memory 2 (Iomemory)
WHITE PAPER Oracle NoSQL Database and SanDisk Offer Cost-Effective Extreme Performance for Big Data 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table of Contents Abstract... 3 What Is Big Data?...
BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB
BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB Planet Size Data!? Gartner s 10 key IT trends for 2012 unstructured data will grow some 80% over the course of the next
Partitioning under the hood in MySQL 5.5
Partitioning under the hood in MySQL 5.5 Mattias Jonsson, Partitioning developer Mikael Ronström, Partitioning author Who are we? Mikael is a founder of the technology behind NDB
How To Use The Correlog With The Cpl Powerpoint Powerpoint Cpl.Org Powerpoint.Org (Powerpoint) Powerpoint (Powerplst) And Powerpoint 2 (Powerstation) (Powerpoints) (Operations
orrelog SQL Table Monitor Adapter Users Manual http://www.correlog.com mailto:[email protected] CorreLog, SQL Table Monitor Users Manual Copyright 2008-2015, CorreLog, Inc. All rights reserved. No part
Raima Database Manager Version 14.0 In-memory Database Engine
+ Raima Database Manager Version 14.0 In-memory Database Engine By Jeffrey R. Parsons, Senior Engineer January 2016 Abstract Raima Database Manager (RDM) v14.0 contains an all new data storage engine optimized
Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage
White Paper Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage A Benchmark Report August 211 Background Objectivity/DB uses a powerful distributed processing architecture to manage
A Visualization System and Monitoring Tool to Measure Concurrency in MPICH Programs
A Visualization System and Monitoring Tool to Measure Concurrency in MPICH Programs Michael Scherger Department of Computer Science Texas Christian University Email: [email protected] Zakir Hussain Syed
QuickDB Yet YetAnother Database Management System?
QuickDB Yet YetAnother Database Management System? Radim Bača, Peter Chovanec, Michal Krátký, and Petr Lukáš Radim Bača, Peter Chovanec, Michal Krátký, and Petr Lukáš Department of Computer Science, FEECS,
Computer Systems Structure Main Memory Organization
Computer Systems Structure Main Memory Organization Peripherals Computer Central Processing Unit Main Memory Computer Systems Interconnection Communication lines Input Output Ward 1 Ward 2 Storage/Memory
News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren
News and trends in Data Warehouse Automation, Big Data and BI Johan Hendrickx & Dirk Vermeiren Extreme Agility from Source to Analysis DWH Appliances & DWH Automation Typical Architecture 3 What Business
Trafodion Operational SQL-on-Hadoop
Trafodion Operational SQL-on-Hadoop SophiaConf 2015 Pierre Baudelle, HP EMEA TSC July 6 th, 2015 Hadoop workload profiles Operational Interactive Non-interactive Batch Real-time analytics Operational SQL
FPGA-based Multithreading for In-Memory Hash Joins
FPGA-based Multithreading for In-Memory Hash Joins Robert J. Halstead, Ildar Absalyamov, Walid A. Najjar, Vassilis J. Tsotras University of California, Riverside Outline Background What are FPGAs Multithreaded
Chapter 13: Query Processing. Basic Steps in Query Processing
Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing
Record Storage and Primary File Organization
Record Storage and Primary File Organization 1 C H A P T E R 4 Contents Introduction Secondary Storage Devices Buffering of Blocks Placing File Records on Disk Operations on Files Files of Unordered Records
IDENTIFYING AND OPTIMIZING DATA DUPLICATION BY EFFICIENT MEMORY ALLOCATION IN REPOSITORY BY SINGLE INSTANCE STORAGE
IDENTIFYING AND OPTIMIZING DATA DUPLICATION BY EFFICIENT MEMORY ALLOCATION IN REPOSITORY BY SINGLE INSTANCE STORAGE 1 M.PRADEEP RAJA, 2 R.C SANTHOSH KUMAR, 3 P.KIRUTHIGA, 4 V. LOGESHWARI 1,2,3 Student,
A Performance Study of Load Balancing Strategies for Approximate String Matching on an MPI Heterogeneous System Environment
A Performance Study of Load Balancing Strategies for Approximate String Matching on an MPI Heterogeneous System Environment Panagiotis D. Michailidis and Konstantinos G. Margaritis Parallel and Distributed
CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level. -ORACLE TIMESTEN 11gR1
CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level -ORACLE TIMESTEN 11gR1 CASE STUDY Oracle TimesTen In-Memory Database and Shared Disk HA Implementation
Putting it all together: Intel Nehalem. http://www.realworldtech.com/page.cfm?articleid=rwt040208182719
Putting it all together: Intel Nehalem http://www.realworldtech.com/page.cfm?articleid=rwt040208182719 Intel Nehalem Review entire term by looking at most recent microprocessor from Intel Nehalem is code
2
1 2 3 4 5 For Description of these Features see http://download.intel.com/products/processor/corei7/prod_brief.pdf The following Features Greatly affect Performance Monitoring The New Performance Monitoring
An Overview of Distributed Databases
International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 4, Number 2 (2014), pp. 207-214 International Research Publications House http://www. irphouse.com /ijict.htm An Overview
Accelerating Business Intelligence with Large-Scale System Memory
Accelerating Business Intelligence with Large-Scale System Memory A Proof of Concept by Intel, Samsung, and SAP Executive Summary Real-time business intelligence (BI) plays a vital role in driving competitiveness
B.Sc (Computer Science) Database Management Systems UNIT-V
1 B.Sc (Computer Science) Database Management Systems UNIT-V Business Intelligence? Business intelligence is a term used to describe a comprehensive cohesive and integrated set of tools and process used
Rackspace Cloud Databases and Container-based Virtualization
Rackspace Cloud Databases and Container-based Virtualization August 2012 J.R. Arredondo @jrarredondo Page 1 of 6 INTRODUCTION When Rackspace set out to build the Cloud Databases product, we asked many
GraySort and MinuteSort at Yahoo on Hadoop 0.23
GraySort and at Yahoo on Hadoop.23 Thomas Graves Yahoo! May, 213 The Apache Hadoop[1] software library is an open source framework that allows for the distributed processing of large data sets across clusters
Proposal of Dynamic Load Balancing Algorithm in Grid System
www.ijcsi.org 186 Proposal of Dynamic Load Balancing Algorithm in Grid System Sherihan Abu Elenin Faculty of Computers and Information Mansoura University, Egypt Abstract This paper proposed dynamic load
Introducing PgOpenCL A New PostgreSQL Procedural Language Unlocking the Power of the GPU! By Tim Child
Introducing A New PostgreSQL Procedural Language Unlocking the Power of the GPU! By Tim Child Bio Tim Child 35 years experience of software development Formerly VP Oracle Corporation VP BEA Systems Inc.
Chapter 6: Physical Database Design and Performance. Database Development Process. Physical Design Process. Physical Database Design
Chapter 6: Physical Database Design and Performance Modern Database Management 6 th Edition Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden Robert C. Nickerson ISYS 464 Spring 2003 Topic 23 Database
An Introduction to Computer Science and Computer Organization Comp 150 Fall 2008
An Introduction to Computer Science and Computer Organization Comp 150 Fall 2008 Computer Science the study of algorithms, including Their formal and mathematical properties Their hardware realizations
Lecture 5: GFS & HDFS! Claudia Hauff (Web Information Systems)! [email protected]
Big Data Processing, 2014/15 Lecture 5: GFS & HDFS!! Claudia Hauff (Web Information Systems)! [email protected] 1 Course content Introduction Data streams 1 & 2 The MapReduce paradigm Looking behind
Computer Architecture
Computer Architecture Slide Sets WS 2013/2014 Prof. Dr. Uwe Brinkschulte M.Sc. Benjamin Betting Part 11 Memory Management Computer Architecture Part 11 page 1 of 44 Prof. Dr. Uwe Brinkschulte, M.Sc. Benjamin
Horizontal Aggregations In SQL To Generate Data Sets For Data Mining Analysis In An Optimized Manner
24 Horizontal Aggregations In SQL To Generate Data Sets For Data Mining Analysis In An Optimized Manner Rekha S. Nyaykhor M. Tech, Dept. Of CSE, Priyadarshini Bhagwati College of Engineering, Nagpur, India
Understanding the Benefits of IBM SPSS Statistics Server
IBM SPSS Statistics Server Understanding the Benefits of IBM SPSS Statistics Server Contents: 1 Introduction 2 Performance 101: Understanding the drivers of better performance 3 Why performance is faster
