Enabling Efficient OS Paging for Main-Memory OLTP Databases
Enabling Efficient OS Paging for Main-Memory OLTP Databases

Radu Stoica, École Polytechnique Fédérale de Lausanne
Anastasia Ailamaki, École Polytechnique Fédérale de Lausanne

ABSTRACT

Even though main memory is becoming large enough to fit most OLTP databases, it may not always be the best option. OLTP workloads typically exhibit skewed access patterns where some records are hot (frequently accessed) but many records are cold (infrequently or never accessed). It is therefore more economical to store the coldest records on a fast secondary storage device such as a solid-state disk. However, main-memory DBMS have no knowledge of secondary storage, while traditional disk-based databases, designed for workloads where data resides on HDD, introduce too much overhead for the common case where the working set is memory resident. In this paper, we propose a simple and low-overhead technique that enables main-memory databases to efficiently migrate cold data to secondary storage by relying on the OS's virtual memory paging mechanism. We propose to log accesses at the tuple level, process the access traces offline to identify relevant access patterns, and then transparently re-organize the in-memory data structures to reduce paging I/O and improve hit rates. The hot/cold data separation is performed on demand and incrementally through careful memory management, without any change to the underlying data structures. We validate the data re-organization proposal experimentally and show that OS paging can be efficient: a TPC-C database can grow two orders of magnitude larger than the available memory size without a noticeable impact on performance.

1. INTRODUCTION

Database systems have traditionally been designed under the assumption that most data is disk resident and is paged in and out of memory as needed. However, the drop in memory prices over the past 30 years has led several database engines to optimize for the case when all data fits in memory. Examples include both research systems (MonetDB [6], H-Store [13], HyPer [14], Hyrise [11]) and commercial systems (Oracle's TimesTen [2], IBM's solidDB [1], VoltDB [4], SAP HANA [7]). Such systems assume all data resides in RAM and explicitly eliminate traditional disk optimizations that are deemed to introduce an unacceptably large overhead.

Today, storage technology trends have changed: along with the large main memory of modern servers, we have solid-state drives (SSDs) that can support hundreds of thousands of IOPS. In recent years, the capacity improvements of NAND flash-based devices have outpaced DRAM capacity growth, accompanied by corresponding trends in cost per gigabyte. In addition, major hardware manufacturers are investing in competing solid-state technologies such as Phase-Change Memory.

In OLTP workloads, record accesses tend to be skewed: some records are hot and accessed frequently (the working set), others are cold and accessed infrequently. Hot records should reside in memory to guarantee good performance; cold records, however, can be moved to cheaper external solid-state storage.
Ideally, we want a DBMS that: a) has the high performance characteristic of main-memory databases when the working set fits in RAM, and b) has the ability of a traditional disk-based database engine to keep cold data on larger and cheaper storage media, while supporting, as efficiently as possible, the infrequent cases where data needs to be retrieved from secondary storage. There is no straightforward solution, as DBMS engines optimized for in-memory processing have no knowledge of secondary storage. Traditional storage optimizations, such as the buffer pool abstraction, the page organization, certain data structures (e.g. B-Trees), and ARIES-style logging, are explicitly discarded in order to reduce overhead. At the same time, switching back to a traditional disk-based DBMS would forfeit the fast processing benefits of main-memory systems.

In this paper, we take the first step toward enabling a main-memory database to migrate data to a larger and cheaper secondary storage. We propose to record accesses during normal system runtime at the tuple level (possibly using sampling to reduce overhead) and to process the access traces off the critical path of query execution in order to identify data worth storing in memory. Based on the access statistics, the relational data structures are re-organized such that the hot tuples are stored as compactly as possible, which leads to improved main-memory hit rates and to reduced OS paging I/O. The data re-organization is unintrusive since only the location of tuples in memory changes, while the internal pointer structure of existing data structures and query execution remain unaffected. In addition, the data re-organization is performed incrementally and only when required by the workload.
[Figure 1: VoltDB throughput (thousands of transactions/sec) as the TPC-C database size (GB) grows, when paging to a FusionIO SSD, for several physical RAM sizes; the start of swapping is marked.]

We implement the data re-organization proposal in a state-of-the-art, open-source main-memory database, VoltDB [4]. Our experimental results show that a TPC-C database can grow 10x larger than the available DRAM without a noticeable impact on performance or latency, and we present micro-benchmark results indicating that the system delivers reasonable performance even in cases where the working set does not fully fit in memory and secondary-storage data is frequently accessed.

In this paper we make the following contributions:

1. We profile the performance of a state-of-the-art main-memory DBMS and identify its inefficiencies in moving data to secondary storage.

2. We propose an unintrusive data re-organization strategy that separates hot from cold data with minimal overhead and allows the OS to efficiently page data to a fast solid-state storage device.

3. We implement the data re-organization technique in a state-of-the-art database and show that it can support a TPC-C dataset that grows 10x bigger than the available physical memory without a significant impact on throughput or latency.

The remainder of this document is organized as follows. Section 2 details the motivation for our work; Section 3 describes the architecture of the data re-organization proposal, which is validated experimentally in Section 4; Section 5 surveys the related work and, finally, Section 6 concludes.

2. MOTIVATION

Our work is motivated by several considerations: i) hardware trends that make storing data on solid-state storage attractive; ii) the workload characteristics of OLTP databases; and iii) the inefficiencies of existing systems, whether traditional disk-based databases or newer main-memory optimized DBMS.

2.1 Hardware trends

It is significantly cheaper to store cold data on secondary storage rather than in DRAM. In previous work [17] we computed an updated version of the 5-minute rule [10] and found that it is more economical to store a 200B tuple on a SSD if it is accessed less than approximately once every 100 minutes. In addition to price, the maximum memory capacity of a server and the memory density can both pose challenges. Today, a high-end workstation can fit at most 4TB of main memory, servers can handle around 256GB of DRAM per CPU socket, and a 1U rack slot hosts less than 512GB of DRAM.

[Figure 2: System architecture. Worker threads in the DBMS engine sample tuple accesses (1) and a writer thread ships the access logs over the network or to storage (2); a separate monitoring process analyzes them to derive access frequencies, per-object memory budgets, and lists of misplaced tuples (3); the engine reads back the optimal tuple placement (4) and re-organizes the hot/cold regions of its heap files, binary trees, and hash indexes (5).]

2.2 Skewed Accesses in OLTP workloads

Real-life transactional workloads typically exhibit considerable access skew. For example, package tracking workloads for companies such as UPS or FedEx exhibit time-correlated skew: records for a new package are frequently updated until delivery, are then used for analysis for some time, and afterwards are accessed only on rare occasions. Another example is the natural skew found on large e-commerce sites such as Amazon, where some items are much more popular than others or where some users are much more active than the average.
Such skewed accesses may change over time, but typically not very rapidly (the access skew might shift over days rather than seconds).

2.3 Existing DBMS Architectures

2.3.1 Disk-based DBMS

One possible option is to use a disk-based database architecture that optimizes for the case where data is on secondary storage. Traditional DBMS pack data in fixed-size pages, use logical pointers (e.g. page IDs and record offsets) that allow the buffer pool abstraction to move data between memory and storage transparently, and have specialized page replacement algorithms to maximize hit rates and reduce I/O latency.

However, given current memory sizes and the lower latency of SSDs compared to HDDs, such optimizations might not be desirable. Stonebraker et al. [20] introduced a main-memory database that is two orders of magnitude faster than a general-purpose disk-based DBMS, while Harizopoulos et al. [12] showed that for transactional workloads the buffer pool introduces a significant amount of CPU overhead: more than 30% of execution time is wasted due to the extra indirection layer. This overhead does not even include latching buffer pool pages at every tuple access, or the overhead of maintaining page-level statistics to implement the page replacement algorithm.
2.3.2 Default OS paging

At the other extreme, a straightforward way of extending the available memory is to keep the main-memory database unchanged and simply use the default OS paging. Unfortunately, we find that such an approach causes unpredictable performance degradation. Figure 1 shows the throughput of the VoltDB DBMS when running the TPC-C benchmark [3] (see Section 4 for the detailed experimental setup). The working set size of the TPC-C benchmark is constant, while the size of the database grows over time as new records are inserted in the Order, NewOrder, OrderLine, and History tables. We vary the amount of physical DRAM available to the OS and enable swapping to a 160GB Fusion ioDrive PCIe SSD [9]. The scale factor is 16 (the initial database occupies 1.6GB) and in the beginning all data is memory resident. As the database size grows, the available RAM is exhausted and the OS starts paging virtual memory pages that it deems cold to the SSD.

Figure 1 shows the trade-off between extending the memory size of a system by using flash memory and the performance penalty incurred. When extending the memory budget from 62GB to 220GB (a 3.5x increase), the throughput drops by 20%; when increasing the main memory from 18GB to 176GB (a 9.8x increase), the throughput decreases by 26%; finally, when extending the memory size from 3GB to 6GB (3 increase), the throughput penalty is 66%. These results are suboptimal, as the TPC-C working set fits in memory even as the database grows. Clearly, the performance impact of paging is significant and needs consideration.

3. DATA RE-ORGANIZATION

Our data re-organization proposal, depicted in Figure 2, is composed of five independent steps: first, tuple-level accesses are logged as part of query execution (phase one); then, the access logs are shipped (phase two) to the processing stage that identifies which tuples are hot or cold (phase three); finally, the output of the processing stage is propagated back to the database engine (phase four), which performs the data re-organization when needed (phase five). The rest of this section details each step of the data re-organization process.

(1) Sample accesses. In the first phase, each worker thread probabilistically samples tuple accesses and writes the access records to a dedicated log structure (a circular buffer). Each worker thread has one such log structure in order to avoid any synchronization-related overhead. Time is split into discrete quanta, each time quantum receiving a single time stamp to further reduce overhead and reduce the access log size. Each access log quantum has the following syntax: <TimeStamp, AccessRecord*>, while an AccessRecord is composed of <ObjectID, [Key|TupleID], FileOffset>. The ObjectID is an identifier that represents the relational data structure containing the tuple (either a table or an index). The next field, [Key|TupleID], identifies the tuple accessed; we use the primary table key or indexing key, denoted as Key, for this purpose. However, the primary key can be replaced with the tuple ID in systems that support such identifiers. The last field, FileOffset, gives the memory offset of the tuple relative to the beginning of the relational data structure. FileOffset is used during the processing phase to determine whether the location of a tuple matches its access frequency (i.e. whether the position of a tuple matches its cold/hot status). The total length of an AccessRecord is 1B + 4B + 8B = 13B.
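As a concrete illustration of this first phase, the sketch below shows one possible shape for such an access record and the per-worker, synchronization-free circular log (C++). The field widths follow the 1B + 4B + 8B = 13B layout above; the type names, the fixed capacity, the xorshift-based sampling, and the API are our own illustrative assumptions rather than VoltDB internals.

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // One sampled tuple access: the relational object (table or index), the
    // primary/indexing key identifying the tuple, and the tuple's offset
    // relative to the beginning of the object's memory region.
    #pragma pack(push, 1)
    struct AccessRecord {
        uint8_t  object_id;    // 1B
        uint32_t key;          // 4B (or a tuple ID, where supported)
        uint64_t file_offset;  // 8B
    };                         // 13 bytes in total, as in the text
    #pragma pack(pop)

    // Per-worker circular log: one instance per worker thread, so no latching
    // is needed; the time stamp is recorded once per quantum, not per access.
    class AccessLog {
    public:
        explicit AccessLog(std::size_t capacity) : buf_(capacity) {}

        void beginQuantum(uint64_t timestamp) { current_ts_ = timestamp; }

        // Probabilistically sample an access (e.g. p = 0.1 for 10% sampling).
        void maybeLog(const AccessRecord& rec, double sampling_probability) {
            if (nextRandom() % 1000 < sampling_probability * 1000.0) {
                buf_[head_ % buf_.size()] = Entry{current_ts_, rec};
                ++head_;   // oldest entries are overwritten if the writer lags
            }
        }

    private:
        struct Entry { uint64_t ts; AccessRecord rec; };

        uint64_t nextRandom() {          // cheap per-thread xorshift PRNG
            rng_ ^= rng_ << 13; rng_ ^= rng_ >> 7; rng_ ^= rng_ << 17;
            return rng_;
        }

        std::vector<Entry> buf_;
        std::size_t head_ = 0;
        uint64_t current_ts_ = 0;
        uint64_t rng_ = 0x9e3779b97f4a7c15ull;
    };

Keeping the records fixed-width and the buffer thread-local is what makes sampling cost only a few instructions per logged access, in line with the low-overhead goal stated above.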
A quantum of s is used, as it is sufficiently fine grained to differentiate between tuples with different access frequencies at the time granularity of interest (as mentioned, a cold tuple should be moved to a SSD if it is accessed less than once every tens to hundreds of minutes). We experimented with multiple sampling probabilities and settled on a default sampling probability of 10% (we log at most 10% of the total tuple accesses), which offers a good tradeoff between logging overhead and precision.

(2) Write access logs. A single dedicated thread writes the access logs either to a file or to the network. The writer thread has three functions. First, it decouples transaction execution from any other I/O or communication overhead. Second, it throttles the rate at which the logs are generated to ensure minimal system interference: if the access logs are not flushed fast enough and fill up, the backpressure causes worker threads to reduce the sampling frequency. Third, the design decouples the database engine from the later processing stage of the access logs.

(3) Process access logs. The access traces are processed offline to identify data placement inefficiencies. The analysis step ideally takes place on a separate server to avoid interference with ongoing query execution. The output of the processing step is: a) a memory budget for each data structure (table or index), and b) two lists of hot/cold misplaced tuples for each object. The hot misplaced tuple list identifies tuples that are accessed often enough to justify storing them in memory but are not yet placed in the hot memory region of the object, while the cold misplaced tuple list contains the tuples that are the best candidates for eviction from the hot memory region of each object. The access traces are processed as follows:

Step 1 - estimate access frequency. The access frequency is estimated using a variant of the Exponential Smoothing algorithm [17], although any other standard algorithm such as ARC [18], CAR [5], or an LRU-K [19] variant could be used. The advantages of the Exponential Smoothing algorithm over other frequency estimation algorithms are twofold: i) it estimates access frequency more accurately, resulting in higher hit rates, and ii) it is efficient, as it supports parallelization and minimizes the number of access records processed (by scanning the access logs in reverse time order).

Step 2 - compute optimal memory allocation. The access frequencies alone are not sufficient to decide what data should be stored in memory; we also need to consider the size of each tuple to maximize the access density. We use the average tuple size of each object to normalize the tuple access frequency. After the normalization, the hot tuples of each object can be identified. The memory budget of each object is simply the number of tuples qualifying as hot multiplied by the average tuple size.

Step 3 - identify misplaced tuples. Finally, we identify misplaced tuples based on the memory allocation of each object. As mentioned, each tuple access contains the relative memory offset of the tuple. The relational data structures are allocated contiguously in memory (as described below), thus we can determine whether the access frequency of a tuple matches its location by simply comparing the tuple's relative memory offset with the hot memory budget of the relational object.
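To make the processing phase more concrete, the sketch below estimates per-tuple access frequencies with a simple exponential-smoothing update and then classifies misplaced tuples by comparing each tuple's offset against the object's hot memory budget (C++). The exact decay formula, the thresholds, and the data layout are simplified assumptions; the actual estimator of [17] parallelizes the work and scans the logs in reverse time order, as noted above.

    #include <cmath>
    #include <cstdint>
    #include <unordered_map>
    #include <vector>

    struct LoggedAccess { uint64_t ts; uint32_t key; uint64_t file_offset; };

    struct TupleStats {
        double   freq = 0.0;        // exponentially smoothed access frequency
        uint64_t last_ts = 0;
        uint64_t file_offset = 0;   // most recently observed location
    };

    // Step 1: est_new = alpha + (1 - alpha)^(t - t_last) * est_old, applied at
    // every sampled access; quanta without accesses only decay the estimate.
    std::unordered_map<uint32_t, TupleStats>
    estimateFrequencies(const std::vector<LoggedAccess>& log, double alpha) {
        std::unordered_map<uint32_t, TupleStats> stats;
        for (const LoggedAccess& a : log) {        // log is in time order here
            TupleStats& s = stats[a.key];
            double decay = std::pow(1.0 - alpha, double(a.ts - s.last_ts));
            s.freq = alpha + decay * s.freq;
            s.last_ts = a.ts;
            s.file_offset = a.file_offset;
        }
        return stats;
    }

    // Step 3: a tuple is misplaced when its frequency and location disagree,
    // i.e. a hot tuple lies past the hot budget or a cold tuple lies inside it.
    struct MisplacedTuples { std::vector<uint32_t> to_promote, to_evict; };

    MisplacedTuples findMisplaced(
            const std::unordered_map<uint32_t, TupleStats>& stats,
            double hot_threshold, uint64_t hot_budget_bytes) {
        MisplacedTuples out;
        for (const auto& kv : stats) {
            const TupleStats& s = kv.second;
            bool hot         = s.freq >= hot_threshold;
            bool in_hot_area = s.file_offset < hot_budget_bytes;
            if (hot && !in_hot_area) out.to_promote.push_back(kv.first);
            if (!hot && in_hot_area) out.to_evict.push_back(kv.first);
        }
        return out;
    }

Step 2, the per-object memory budget, then reduces to counting the tuples whose normalized frequency clears the threshold and multiplying by the object's average tuple size, as described in the text.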
[Figure 3: Memory allocation and hot/cold data placement. (a) Memory allocation strategy: new inserts go either to the hot data region or to the cold data region, each with its own free-space list; the boundary between the regions can shrink or grow and tuples migrate between them. (b) Node re-organization for binary trees and hash indexes: the initial layout is rearranged so that hot nodes are clustered in the hot region and cold nodes in the cold region.]

(4) Read optimal tuple placement. A dedicated thread reads the output of the processing step. It first reads and sets the target memory budget for each object, then incrementally reads the lists of misplaced tuples for each object as needed by the tuple re-organization.

(5) Re-organization. Each data structure is allocated memory sequentially in the virtual address space of the database process using the facilities of the OS. The memory layout of a data structure is presented in Figure 3(a). The beginning portion of the data file is logically considered hot and is pinned in memory. The rest of the file is left to contend for system RAM and uses the default OS paging policy. The OS is left to manage at least 10% of the available memory to ensure there is enough memory for other processes. The tuple re-organization is performed without changing the pointer structure or record format of the relational objects; we change only how memory is allocated, as depicted in Figure 3(b). The memory allocator of each data structure has two free-space lists, one for the hot memory region and the other for the cold memory region. When inserting, the data structure can request either hot or cold memory; when deleting, memory is reclaimed by comparing the memory offset with the hot/cold threshold and appending the memory region to the corresponding free-space list. A tuple movement can be thought of logically as a delete operation on the misplaced tuple followed by its re-insertion in the appropriate memory region.

Overall, the key insights of our technique are as follows: i) we decouple, to the extent possible, all data re-organization operations (logging, analysis, re-organization) from the critical path of query execution; ii) we optimize data locality such that hot tuples are stored compactly in a contiguous memory region; and iii) we restrict OS paging decisions by preventing code memory sections and the hot data regions of relational objects from being paged out.
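A bare-bones version of such an allocator is sketched below (C++): one free-space list per region, allocation routed to the hot or cold region, reclamation decided purely by the offset, and tuple migration expressed as delete-plus-reinsert. The fixed-size slots and the raw memcpy() are simplifying assumptions; in the engine the re-insertion goes through the data structure itself, so its internal pointers end up referring to the new location.

    #include <cstddef>
    #include <cstring>
    #include <vector>

    // Allocator for one relational object stored in a single contiguous region:
    // offsets in [0, hot_limit) form the pinned hot area, the rest is cold and
    // is left to the default OS paging policy. Each area has its own free list.
    class HotColdAllocator {
    public:
        HotColdAllocator(char* base, std::size_t size, std::size_t slot,
                         std::size_t hot_limit)
            : base_(base), slot_(slot), hot_limit_(hot_limit) {
            for (std::size_t off = 0; off + slot <= size; off += slot)
                (off < hot_limit ? hot_free_ : cold_free_).push_back(off);
        }

        // Request space from the hot or the cold region (nullptr when full).
        char* allocate(bool hot) {
            std::vector<std::size_t>& list = hot ? hot_free_ : cold_free_;
            if (list.empty()) return nullptr;
            std::size_t off = list.back();
            list.pop_back();
            return base_ + off;
        }

        // Reclaim a slot: the offset alone decides which free list it rejoins.
        void release(char* p) {
            std::size_t off = static_cast<std::size_t>(p - base_);
            (off < hot_limit_ ? hot_free_ : cold_free_).push_back(off);
        }

        // A tuple move is a delete followed by a re-insert into the region that
        // matches the tuple's new hot/cold status.
        char* migrate(char* p, bool make_hot) {
            char* dst = allocate(make_hot);
            if (dst != nullptr) {
                std::memcpy(dst, p, slot_);
                release(p);
            }
            return dst;
        }

    private:
        char*       base_;
        std::size_t slot_;
        std::size_t hot_limit_;
        std::vector<std::size_t> hot_free_, cold_free_;
    };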
4. EVALUATION

In this section, we evaluate experimentally how effective the data re-organization technique is in reducing OS paging overhead. We first demonstrate the end-to-end performance of a main-memory DBMS running the TPC-C benchmark when the majority of the database resides on an SSD. We show that the data re-organization strategy and paging-related I/O have little impact on the overall system throughput or latency even when the database size outgrows the RAM size by a factor of more than 10. As a TPC-C database has a predictable working set and data growth, we then explore the impact of a full workload change on overall system performance through a micro-benchmark.

4.1 Experimental Setup

DBMS. We use the open-source commercial VoltDB [4] DBMS (version 2.7) running on Linux to implement the data re-organization technique. VoltDB uses data partitioning to handle concurrency and to ensure scaling with the number of cores. Each worker thread has its own set of data structures (both tables and indexes) that it can access independently of the other worker threads. Thus, each logical table is composed of several physical tables, with each worker thread assigned one such physical table. In all experiments, the database is partitioned to match the number of available cores.

Memory management. The core VoltDB functionality is implemented in C++, while the front-end (network connectivity, serialization of results, query plan generation, DBMS API) is implemented in Java. The memory overhead of the front-end is independent of the database size, and we found it crucial to keep the front-end-related objects and code memory resident. We pin in memory all code mappings, front-end data structures, and the Java heap, and allow only the large relational data structures (tables, hash and binary tree indexes) to be paged out to the SSD. We track which address ranges correspond to code sections or to other data structures by examining the /proc process information pseudo-file system. For relational objects, we allocate memory sequentially in the virtual address space of the database process by using the mmap()-related system calls, and pin/unpin pages in memory using the mlock()/munlock() system calls.

We note that the VoltDB core engine already optimizes memory allocation. VoltDB allocates memory in large contiguous chunks for each object in order to reduce malloc() call overhead and memory fragmentation. Each relational data structure has its own memory allocation pool, and memory chunks never contain data from more than one table or index, which ensures that virtual memory pages contain data of the same type. Therefore, the performance benefits of the data re-organization stem only from taking access frequency into account, rather than from other memory management optimizations.
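A minimal Linux sketch of this allocation-and-pinning scheme is shown below: the object's address range is reserved with mmap() and only the hot prefix is pinned with mlock(), while the cold remainder stays eligible for paging. The anonymous mapping and the reduced error handling are simplifications; backing the mapping with a file on the SSD, as in our modified engine, would use open() plus MAP_SHARED instead, and mlock() needs a sufficiently high RLIMIT_MEMLOCK (or CAP_IPC_LOCK) to succeed.

    #include <sys/mman.h>
    #include <cstddef>
    #include <cstdio>

    // Reserve a contiguous virtual-address range for one table or index and pin
    // the first hot_bytes in RAM; the remainder may be paged out by the OS.
    void* createObjectRegion(std::size_t total_bytes, std::size_t hot_bytes) {
        void* base = mmap(nullptr, total_bytes, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED) { perror("mmap"); return nullptr; }
        if (hot_bytes > 0 && mlock(base, hot_bytes) != 0)
            perror("mlock");   // typically RLIMIT_MEMLOCK is too small
        return base;
    }

    // Grow or shrink the pinned hot prefix when the processing step assigns the
    // object a new memory budget.
    void resizeHotRegion(void* base, std::size_t old_hot, std::size_t new_hot) {
        char* p = static_cast<char*>(base);
        if (new_hot > old_hot)
            mlock(p + old_hot, new_hot - old_hot);
        else if (new_hot < old_hot)
            munlock(p + new_hot, old_hot - new_hot);
    }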
Query execution. Transactions run as stored procedures (no query optimization is performed at runtime), with query execution and result externalization handled by separate threads. The benchmark client, responsible for submitting transactions, is placed on a different machine on the same 10Gb local network. For throughput results, we make sure the server is fully loaded by maintaining sufficient in-flight transactions, and we measure throughput every second in all experiments. The network bandwidth is never saturated and the network latency does not affect any throughput results due to the batch-style processing architecture.

Hardware. In all our experiments we use a 4-socket Quad-Core AMD Opteron (16 cores in total) equipped with 64GB of physical DRAM (the amount of memory visible to the OS varies according to each experiment). The paging device is a single 160GB FusionIO PCIe SSD that can support, according to our measurements, up to 6,000 4kB random read IOPS and between 20,000 (long-term throughput) and 7,000 (peak throughput) 4kB random write IOPS.

[Figure 4: TPC-C throughput (thousands of transactions/sec) as the database size (GB) grows, for the in-memory configuration, the data re-organization technique, and default OS paging; the point where the database size exceeds physical RAM is marked.]

[Table 1: TPC-C transaction latency - average and standard deviation, in µs, measured at the server and at the client, with and without paging, for the NewOrder, OrderStatus, Payment, Delivery, and StockLevel transactions.]

4.2 TPC-C Results

We use the TPC-C benchmark [3] to validate the overall efficiency of the data re-organization strategy. The scaling factor is set to 16 to match the number of cores, and the database is partitioned on the warehouse ID (standard TPC-C partitioning), with each worker thread responsible for executing all transactions related to its warehouse. To maximize the number of growing tables and the database growth rate, we disable the deletion of fulfilled orders from the NewOrder table in the Delivery transaction.

Throughput. We measure the TPC-C transaction execution throughput in three scenarios. In the first case, we run the unmodified VoltDB database with all of the 64GB of RAM available to the OS (about 62GB were useful for actual data storage). In the second scenario, we run TPC-C on the same unmodified engine but restrict the amount of physical memory to 5GB (about 3GB of useful memory) and turn on swapping to the SSD. In the third case, we run VoltDB modified with our data re-organization technique and maintain the same memory configuration (about 3GB of useful memory); the memory-mapped data files are backed directly by the SSD without an in-between file system. As shown in Figure 4, the hot/cold data separation stabilizes throughput within 7% of the in-memory performance, although the actual data stored grows 10x larger than the amount of RAM available. In contrast, the throughput of the unmodified engine drops by 66% when swapping to the SSD.

Latency. Paging potentially introduces I/O operations in the critical path of a transaction, and a relevant question is how SSD I/O changes transaction latency. We repeat the same TPC-C experiments, only this time we throttle the maximum number of transactions at 0% of the maximum throughput (to prevent transaction latency from including queuing times). Table 1 shows the transaction latency as measured from both the server and the client side, together with each transaction's standard deviation. As shown, paging increases the average transaction latency by 0µs and its standard deviation by 00µs. However, the variability increase is small relative to the total transaction execution time and becomes insignificant from the client perspective. The transaction latency, as experienced by the client, is dominated not by the actual transaction execution but rather by the software overhead of submitting a transaction to the server, by the overhead of the network I/O, and finally by transaction management on the server side. To put the values into perspective, the advertised I/O latency of the Fusion ioDrive SSD is 25µs and the network latency (round-trip wire latency plus switching) is 30µs.
4.3 Response to workload changes

The TPC-C working set size is fixed (the size of the hot data does not change over time) and only 4 tables out of 9 are growing. In addition, accesses show a predictable time correlation: the newest tuples in the growing tables are also the hottest. We expect that such a time correlation exists in most workloads; nonetheless, we want to understand the stability of the system if the working set shifts in unpredictable ways or if the working set does not fully fit in RAM. To answer these questions, we developed a micro-benchmark with a dual purpose: to measure performance for the case where the working set does not fully fit in memory, and to test how fast the system reacts to workload changes.

We generate a database composed of B tuples indexed by a 4B integer primary key. The total memory footprint of the VoltDB DBMS (the table, the index, plus VoltDB memory overhead) is 26.2GB. The database size is constant and the available memory is set to 5GB (physical memory is 20% of the total database size). We execute a single query that retrieves individual tuples:

SELECT * FROM R WHERE PK = <ID>;

The ID values for the select query are generated according to a Zipfian distribution skewed so that 80% of accesses target 20% of the data. To measure how fast the system adapts to a workload change, we randomize the set of hot tuples while preserving the same access distribution and measure how throughput varies over time.
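The workload driver itself is not spelled out beyond the query template and the skew, so the sketch below only illustrates one way to generate such point lookups (C++): a self-similar 80-20 rule stands in for the Zipfian generator, and the table size, the query string, and the key remapping used to emulate a working-set shift are illustrative assumptions.

    #include <cstdint>
    #include <random>
    #include <string>

    // Draw a key in [0, n) so that roughly 80% of draws land in a hot 20% of
    // the key space (self-similar 80-20 rule, used here as a simple stand-in
    // for the Zipfian generator described in the text).
    uint64_t skewedKey(uint64_t n, std::mt19937_64& rng) {
        std::uniform_real_distribution<double> uni(0.0, 1.0);
        uint64_t lo = 0, hi = n;
        while (hi - lo > 1) {                    // recurse on the 80/20 split
            uint64_t cut = lo + (hi - lo) / 5;   // first 20% of the current range
            if (cut == lo) cut = lo + 1;
            if (uni(rng) < 0.8) hi = cut; else lo = cut;
        }
        return lo;
    }

    // The hot keys above are the lowest key values; XOR-ing with a per-phase
    // seed and reducing modulo n roughly reshuffles which keys are hot, which
    // emulates the randomized working-set shift.
    uint64_t shiftedKey(uint64_t key, uint64_t n, uint64_t phase_seed) {
        return (key ^ phase_seed) % n;
    }

    // Point lookup issued against the benchmark table.
    std::string makeQuery(uint64_t key) {
        return "SELECT * FROM R WHERE PK = " + std::to_string(key) + ";";
    }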
We show in Figure 5 the impact of the working-set shift on the throughput of the system. We distinguish five distinct phases. The initial steady state (phase 1) represents the long-term behavior of the system when the table and index data structures are fully optimized according to the access pattern. Once the workload shift occurs, the performance of the system is immediately affected (phase 2) and drops by 96%, from 7,000 to 4,00 queries/sec. The OS page replacement algorithm is the first to react to the workload shift (phase 3), as the hottest pages from the cold file portions are identified and buffered in memory. As the access distribution contains significant skew, even the small memory budget available to the OS is enough to capture the most frequently accessed pages; as a result, throughput improves to around 2,000 queries per second. The throughput remains stable (phase 4) until the data re-organization starts to relocate tuples (after a configurable delay of 60s). Once the data relocation commences, the throughput quickly stabilizes at the original level of 6,000 queries/sec (phase 5). We note that the unmodified VoltDB system is unable to execute any queries: the high memory pressure combined with the lack of a clear working set causes such high OS paging activity that the database becomes unresponsive.

5. RELATED WORK

There has been much work exploring OLTP engine architectures optimized for main-memory access. Research prototypes include [6, 13, 14, 11, 15] and there are already a multitude of commercial main-memory offerings [2, 1, 4, 7]. Such engines assume that the entire database fits in memory, completely dropping the concept of secondary storage. We diverge from the memory-only philosophy by considering a scenario where a main-memory database may migrate cold data to secondary storage by using OS paging, and we investigate how to efficiently separate hot from cold data in order to improve the memory hit rate and reduce I/O.

The work closest to ours is HyPer's cold-data management scheme [8]. In HyPer, hot transactional data is identified and separated from cold data that might only be referenced by analytical queries. Once identified and separated, the cold data is compressed into a read-optimized format suitable for analytical queries in order to reduce its memory footprint. However, HyPer's data re-organization scheme has a different goal, namely to reduce the overhead of creating new OLAP threads. This is achieved by minimizing the number of pages that are actively modified by the OLTP threads and by compressing and storing the OLAP-only data on large virtual memory pages. HyPer also takes a different approach to implementing the data re-organization: it keeps access statistics at a larger granularity (at the VM page level) by overriding the OS virtual memory infrastructure, and uses an online heuristic to separate hot from cold data as an integral part of query execution.

Compression techniques can reduce the memory footprint of a database and can help support larger in-memory datasets. Compression, however, is an orthogonal topic. Compressing relational data without changing the tuple order inside the underlying data structure does not create a separation of cold from hot data, but rather can be viewed as a technique for expanding the available system memory.
[Figure 5: Throughput (thousands of queries/sec) over time for a shifting Zipfian working set, comparing the in-memory configuration, the data re-organization technique, and uniform accesses.]

The incentives for separating data based on access frequency remain the same. In addition, if data is stored based on its access frequency, the hot and cold datasets can be compressed with different techniques. A similar idea is used in [8], where only the cold data is compressed.

6. CONCLUSIONS

Our data re-organization proposal is generally applicable, as it changes neither the pointer structure of the physical data structures nor the concurrency mechanism of the database engine. The hot/cold data separation can simply be thought of as a better way of allocating memory that takes access frequency into account, while the tuple re-organization can be modeled as short search/delete/insert transactions on individual tuples. At the same time, our proposal can be suboptimal: some data structures maintain invariants that might force accessing cold data on the way to retrieving hot tuples. For example, the hash index proposed in [6] uses linear hashing, i.e. it maintains tuples sorted on keys inside the hash buckets to allow incremental expansion of the hash index. Our data re-organization technique cannot distinguish between logically cold and hot data as it relies on physical accesses; cold tuples in a hash bucket accessed on the way to the target hot tuple are considered as hot as the target tuple. Possible solutions for such cases could be either to change the original data structures or to use two different data stores, one for the hot and the other for the cold data. Unfortunately, both approaches have deeper architectural implications.

We proposed a simple solution that allows a main-memory database to efficiently page cold data to secondary storage. Our technique logs accesses at the tuple level, processes the access logs offline to identify hot data, and re-organizes the in-memory data structures accordingly to reduce paging I/O and improve memory hit rates. The hot/cold data separation is performed on demand and incrementally through careful memory management, without any change to the underlying data structures. Our experimental results show that it is feasible to rely on the OS virtual memory paging mechanism to move data between memory and secondary storage even if the database size grows much larger than main memory. In addition, the hot/cold data re-organization offers reasonable performance even when the working set does not fully fit in memory, and it can adapt within a time frame of a few minutes to full workload shifts.
7. REFERENCES

[1] IBM solidDB.
[2] Oracle TimesTen In-Memory Database.
[3] Transaction Processing Performance Council. TPC-C Standard Specification.
[4] VoltDB In-Memory Database.
[5] S. Bansal and D. S. Modha. CAR: Clock with adaptive replacement. In FAST, 2004.
[6] P. A. Boncz, M. Zukowski, and N. Nes. MonetDB/X100: Hyper-pipelining query execution. In CIDR, 2005.
[7] F. Färber, S. K. Cha, J. Primsch, C. Bornhövd, S. Sigg, and W. Lehner. SAP HANA database: data management for modern business applications. ACM SIGMOD Record, 2012.
[8] F. Funke, A. Kemper, and T. Neumann. Compacting transactional data in hybrid OLTP & OLAP databases. In VLDB, 2012.
[9] Fusion-io. Technical specifications.
[10] J. Gray and F. Putzolu. The 5 minute rule for trading memory for disc accesses and the 10 byte rule for trading memory for CPU time. In SIGMOD, 1987.
[11] M. Grund, J. Krüger, H. Plattner, A. Zeier, P. Cudre-Mauroux, and S. Madden. HYRISE: a main memory hybrid storage engine. In VLDB, 2010.
[12] S. Harizopoulos, D. J. Abadi, S. Madden, and M. Stonebraker. OLTP through the looking glass, and what we found there. In SIGMOD, 2008.
[13] R. Kallman, H. Kimura, J. Natkins, A. Pavlo, A. Rasin, S. Zdonik, E. P. Jones, S. Madden, M. Stonebraker, Y. Zhang, et al. H-Store: a high-performance, distributed main memory transaction processing system. In VLDB, 2008.
[14] A. Kemper and T. Neumann. HyPer: A hybrid OLTP&OLAP main memory database system based on virtual memory snapshots. In ICDE, 2011.
[15] P.-Å. Larson, S. Blanas, C. Diaconu, C. Freedman, J. M. Patel, and M. Zwilling. High-performance concurrency control mechanisms for main-memory databases. In VLDB, 2011.
[16] P.-Å. Larson, S. Blanas, C. Diaconu, C. Freedman, J. M. Patel, and M. Zwilling. High-performance concurrency control mechanisms for main-memory databases. In VLDB, 2011.
[17] J. Levandoski, P.-Å. Larson, and R. Stoica. Identifying hot and cold data in main-memory databases. In ICDE, 2013.
[18] N. Megiddo and D. S. Modha. ARC: A self-tuning, low overhead replacement cache. In FAST, 2003.
[19] E. J. O'Neil, P. E. O'Neil, and G. Weikum. The LRU-K page replacement algorithm for database disk buffering. In SIGMOD, 1993.
[20] M. Stonebraker, S. Madden, D. J. Abadi, S. Harizopoulos, N. Hachem, and P. Helland. The end of an architectural era: (it's time for a complete rewrite). In VLDB, 2007.