SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery Ismail Oukid*, Daniel Booss, Wolfgang Lehner*, Peter Bumbulis, and Thomas Willhalm + *Dresden University of Technology SAP AG + Intel GmbH DaMoN 2014, Snowbird, Utah, USA, June 23, 2014 Prof. Dr.-Ing. Wolfgang Lehner Diese Zeile ersetzt man über: Einfügen > Kopf- und
What is Storage Class Memory? Non-Volatile Memory Byteaddressable (load/store semantics) Latency close to DRAM s, with writes slower than reads More energy efficient than DRAM Storage Class Memory Denser than DRAM (i.e. better scaling) e.g. PCM, STTRAM, MRAM and Memristors Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 2
SCM Compared to Other Technologies Latency Bandwidth Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 3
SCM Compared to Other Technologies Latency HDD Bandwidth Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 4
SCM Compared to Other Technologies Latency HDD DRAM Bandwidth Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 5
SCM Compared to Other Technologies Latency HDD Bridging the gap between memory and storage DRAM Bandwidth Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 6
SCM Compared to Other Technologies Latency HDD SSD Bridging the gap between memory and storage DRAM Bandwidth Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 7
SCM Compared to Other Technologies Latency HDD SSD Bridging the gap between memory and storage SCM DRAM Bandwidth Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 8
SCM Compared to Other Technologies Latency HDD SSD Bridging the gap between memory and storage SCM DRAM Bandwidth Storage Class Memory is a merging point between memory and storage Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 9
Main Memory DB Restart Time Availability guarantees are an important part of many SLAs Availability of 99% downtime of 3.65 days/year! Many database crash conditions are transient A reasonable approach to recovery: restarting the database Database restart time impacts directly database availability Long restart times: a shortcoming of main memory databases Improve database restart time using SCM Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 10
Overview of SOFORT (1) SOFORT: A hybrid SCM-DRAM storage engine for fast data recovery Requirements: Fast data recovery Mixed OLTP and OLAP A hybrid SCM-DRAM memory environment Architecture: Single-level column-store Dictionary encoded Multi version concurrency control (MVCC) No transactional log Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 11
Overview of SOFORT (2) SOFORT Table MVCC Entry CTS DTS Appendonly TX Array Tables MVCC Array Columns Column On DRAM to enable better performance Dictionary Encoded Persisted on SCM Volatile in DRAM Dict. Index Dict. Index (optional) Value IDs Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 12
Persisting Data on the Fly SCM accessed via the CPU cache: Writes not guaranteed to have reached SCM Need to flush data from cache Code reordered at compilation time & Complex CPU out-of-order execution Need to order memory operations Persistence primitives: flushing instructions (CLFLUSH), memory barriers (MFENCE, SFENCE, LFENCE) and non-temporal stores () Data may still be held in memory controller buffers Need to flush memory controller buffers on power failures Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 13
Persisting Data on the Fly: Example Int var = 0; Bool persisted = false; var = 1; Flush var; persisted = true; Flush persisted; Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 14
Persisting Data on the Fly: Example Int var = 0; Bool persisted = false; var = 1; Flush var; persisted = true; Cache SCM var 0 0 persisted False False Flush persisted; Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 15
Persisting Data on the Fly: Example Int var = 0; Bool persisted = false; var = 1; Flush var; persisted = true; Cache SCM var 0 0 persisted True True Flush persisted; Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 16
Persisting Data on the Fly: Example Int var = 0; Bool persisted = false; var = 1; Flush var; persisted = true; Cache SCM var 1 0 persisted True True Flush persisted; Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 17
Persisting Data on the Fly: Example Int var = 0; Bool persisted = false; var = 1; Flush var; MFENCE; persisted = true; Cache SCM var 0 0 persisted False False Flush persisted; Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 18
Persisting Data on the Fly: Example Int var = 0; Bool persisted = false; var = 1; Flush var; MFENCE; persisted = true; Cache SCM var 0 0 persisted False False Flush persisted; Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 19
Persisting Data on the Fly: Example Int var = 0; Bool persisted = false; var = 1; MFENCE; Flush var; MFENCE; persisted = true; Cache SCM var 0 0 persisted False False Flush persisted; Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 20
Persisting Data on the Fly: Example Int var = 0; Bool persisted = false; var = 1; MFENCE; Flush var; MFENCE; persisted = true; MFENCE; Flush persisted; MFENCE; Cache SCM var 0 0 persisted False False Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 21
Persistent Memory Allocator Persistent Memory File System (PMFS): SCM-aware file system No buffering in DRAM on mmap direct access to SCM PMAllocator: Huge PMFS files as memory pages Pages cut into segments for allocation Persistent counter of memory pages Mapping from persistent memory to VM PMFS file = Memory page Segment File name = Unique page ID Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 22
Persistent Memory Pointers Regular pointers are bound to the program s address space Cannot be used for recovery We propose Persistent Memory Pointers (PMPtrs): Struct PMPtr { int64_t m_baseid; ptrdiff_t m_offset; } Persistent memory page ID Offset indicating the start of the allocated block Can be converted (swizzled) to regular pointers PMPtrs stay valid across failures Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 23
Recovery Mechanism PM Entry Point Reload PMAllocator Reload SOFORT A PMPtr to the PMAllocator A PMPtr to the storage engine Reload memory pages Reconstruct mapping to VM Rollback unfinished transactions Allocate ressources for recovery Reload Table M Reload Table 1 Data structures sanity check Initiate columns recovery Reload column N Reload column 1 Data structures sanity check Reconstruct dictionary indexes Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 24
Evaluation: Setup Hardware-based SCM simulation: Intel Xeon E5 @2.60Ghz, 20MB L3 cache, 8 physical cores/socket Avoiding NUMA effects: benchmark run on a single socket Limitation: symmetric instead of asymmetric read/write latency Socket 0 Socket 1 QPI Link CPU (8 cores) CPU (8 cores) Memory Bus Memory Bus DRAM SCM (PMFS mount) DRAM Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 25
Evaluation: Setup TATP benchmark: Simple but realistic OLTP benchmark (20% write, 80% reads) Simulates a telecommunication application Initial population of 1 million subscribers (~2GB data size) Shore-MT: A storage manager designed for scalability in multicores Run on Ramdisk (tempfs) Restart time: Single, 4 column table SCM latency set to 200ns Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 26
Mio. Tx/s Evaluation: Throughput 2 1,8 1,6 GetSubData Throughput (Read Only) Realistic case Upper bound 1,4 1,2 shore-read-shm 1 0,8 0,6 0,4 0,2 User contention. Solved with PLP. Worst case sofort-read-shm sofort-read-200ns sofort-read-300ns sofort-read-700ns 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 # users Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 27
Mio. Tx/s Evaluation: Throughput 1,2 1 0,8 0,6 0,4 0,2 TATP Throughput (80% Read, 20% Write) ~900K Tx/s ~550K Tx/s ~250K Tx/s Upper bound Worst case Realistic case shore-shm sofort-shm sofort-200ns sofort-300ns sofort-700ns 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 # users SOFORT outperforms Shore-MT even in a high latency SCM environment Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 28
Mio Tx/s Mio Tx/s Restart Time (s) Restart Time (s) 6 4 2 0 10 8 6 4 2 0 Evaluation: Restart Time Dict. Size Fixed 50 100 150 200 250 Mio. Rows/Table SOFORT on SCM ~2sec 0 5 10 15 20 25 30 35 40 45 50 Time (s) Restart time is independent of data size time-dbfixedhashmap time-dictfixedskiplist time-dictfixedhashmap SOFORT recovers nearly instantly Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 29 5 4 3 2 1 0 Data Size Fixed 2 4 6 8 10 Mio. Distinct Values Restart time is independent of instance size and depends only on dictionaries size 8 6 4 2 0 SOFORT on DISK ~18sec 0 5 10 15 20 25 30 35 40 45 50 Time (s) Volatile structures drive restart time time-dbfixedskiplist No persistence guarantees
Conclusion and Future Work We showed that SCM can help getting rid of the transactional log and significantly improving database restart times Latency study based on hardware simulation showed that SOFORT exhibits competitive performance even in a high latency SCM environment Future Work: Extend SOFORT to support long transactions Explore new recovery schemes Design persistent index structures Experiment with write-through caching policy Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery DaMoN 14, June 23 30
Takeaway: SCM is a game changer in the database industry Make the change happen! Thank You! Questions? Comments? Ismail Oukid SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery Ismail Oukid, Daniel Booss, Wolfgang Lehner, Peter Bumbulis, and Thomas Willhalm + DaMoN 2014, Snowbird, Utah, USA, June 23, 2014 Prof. Dr.-Ing. Wolfgang Lehner Diese Zeile ersetzt man über: Einfügen > Kopf- und