In-Memory Computing: New Architecture for Big Data Processing
Hai Jin
Cluster and Grid Computing Lab / Services Computing Technology and System Lab / Big Data Technology and System Lab
School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
hjin@hust.edu.cn
Big Data: More Than Big
Varieties of Data Internet data Scientific data Multimedia data Enterprise/IoT data
Characteristics of Enterprise/IoT Data Regular structure and expression Rapid growth Timeliness data processing High value density High security requirement
In-Memory Computing
Gartner Top 10 Strategic Technology Trends for 2013
Overview of SAP HANA
Oracle's Big Data Appliance / Exalytics
Energy Consumption in Modern Electronic ICT Systems
DRAM Read Power vs. Data Rate
Downsides of DRAM Refresh
- Energy consumption: each refresh consumes energy
- Performance degradation: a DRAM bank is unavailable while it is refreshed
- QoS/predictability impact: (long) pause times during refresh
- Refresh rate limits DRAM capacity scaling
Refresh Overhead: Performance (about 8% loss at today's densities, projected up to 46% at 64 Gb)
Refresh Overhead: Energy (about 15% at today's densities, projected up to 47% at 64 Gb)
Emerging Non-volatile Memory (NVM) Technologies
Intel-Micron 3D XPoint Technology
International Technology Roadmap for Semiconductors Projection
Intel-Micron 3D XPoint Technology
New Advances on Memory and Storage: Storage Class Memory
Storage Class Memory
SCM is a new class of data storage and memory devices. SCM blurs the distinction between MEMORY (fast, expensive, volatile) and STORAGE (slow, cheap, non-volatile).
Characteristics of SCM:
- Solid state, no moving parts
- Short access times (DRAM-like, within an order of magnitude)
- Low cost per bit (disk-like, within an order of magnitude)
- Non-volatile (~10 years retention)
Emerging NVM Technological Choice: Time Slots by Application
Reconstruction of Virtual Memory Architecture: Break the I/O Bottleneck
New In-Memory Computing Architecture
Build a DRAM + SCM hybrid hierarchical/parallel memory structure.
(Figure: data flows over the memory channel to hybrid DRAM/SCM memory and over the I/O channel to disk.)
The new in-memory computing architecture supports the shift from computing-centric processing to a combination of computing and data.
100x Faster Access to Data using SCM
HP Nanostore Project
The Machine - Memory-Driven Computing
The Machine - Memory-Driven Computing
The Machine Roadmap
Technology and System of In-Memory Computing for Big Data Processing: Project Overview
Total budget: 170M RMB
Period: January 2015 to December 2017
Participants: Inspur, Huawei, Sugon, Shanghai Jiaotong University, Huazhong University of Science and Technology, Chongqing University, National University of Defense Technology, Huadong Normal University, Jiangnan Computing Institute
Technology and System of In-Memory Computing for Big Data Processing: Project Mission
- A hybrid NVM-based, highly reliable, massive-storage, low-power in-memory computing system; the NVM capacity of each node should be at the TB level, supporting zero bootup
- System software and a simulation platform for the in-memory computing system
- A parallel processing system for the in-memory computing system
- An in-memory database for the hybrid memory architecture to support decision making and other data management applications
System Architecture
Full System Simulator
(Figure: applications run on cores sharing a latency-aware cache; software-level migration libraries and an OS migrator sit above hardware-level migration strategies, with a data migration service and channel-redirect mapping between DRAM and PCM.)
- Designed based on MARSSx86 + NVMain
- More flexible: supports NVMain, DRAMSim, or HybridSim as the main-memory simulator
- Simulates the memory system precisely; easy to configure
Data Migration
Data placement determines access latency, energy consumption, and device lifetime.
Explicit migration: allocate and migrate virtual pages specified by the programmer.

    #include "lib/migrate.hh"

    int main() {
        Migrate_Init();
        p1 = malloc(size_1);
        p2 = malloc(size_2);
        Migrate(p1, size_1, PCM_MEM);
        Migrate(p2, size_2, DRAM_MEM);
    }

(Figure: virtual pages p1 and p2 in the virtual address space are mapped via migration to PCM and DRAM regions of the physical address space.)
Data Migration
Implicit migration: allocate virtual pages automatically and migrate them later, for low access latency and low energy consumption.
(Figure: cores record access information; pages are classified with an energy model and migrated between PCM and DRAM.)
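The implicit scheme can be sketched as a counter-based classifier: the runtime records per-page access counts and, at the end of each epoch, places hot pages in DRAM and cold pages in PCM. This is an illustrative sketch; the names (`MigrationManager`, `HOT_THRESHOLD`, `record_access`) are hypothetical, not the project's actual API:

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical sketch of implicit migration: count accesses per virtual
// page, then periodically move hot pages to DRAM (low latency) and cold
// pages to PCM (low static energy, large capacity).
enum class Device { DRAM, PCM };

struct MigrationManager {
    std::unordered_map<uint64_t, uint64_t> access_count; // page -> #accesses
    std::unordered_map<uint64_t, Device> placement;      // page -> device
    static constexpr uint64_t HOT_THRESHOLD = 64;        // assumed cutoff

    void record_access(uint64_t page) { ++access_count[page]; }

    // Classify pages and "migrate" them each epoch; a real system would
    // also copy the physical frame and update the page table here.
    void migrate_epoch() {
        for (auto& [page, count] : access_count) {
            placement[page] = (count >= HOT_THRESHOLD) ? Device::DRAM
                                                       : Device::PCM;
            count = 0; // age the counters so placement tracks phase changes
        }
    }
};
```

Resetting counters each epoch is one simple aging rule; a real policy would also weigh the energy model shown in the figure.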
Miss Penalty Aware Cache Replacement
On an LLC miss: low latency if the refill is a DRAM row-buffer hit; high latency if it must access PCM.
The miss-penalty-aware policy balances data locality against miss latency.
Flint: Exploiting Raw Data of In-Memory Data Objects in Distributed Data-Parallel Systems
- Completely decomposes in-memory data objects to eliminate reference chasing and reduce frequent garbage collection in the JVM
- A system that automatically converts user code, decomposes data objects, and manages in-memory raw data
- Speedup from 2x to 5.67x compared with Apache Spark

App.     LR      KMeans  PR      CC     Bagel-PR  Bagel-CC  WC
Spark    15.7s   35s     930s    90s    276.6s    204s      900s
Flint    6s      10s     237s    36s    56s       36s       426s
Speedup  2.61x   3.5x    3.92x   2.5x   4.94x     5.67x     2.11x
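The decomposition idea can be illustrated in C++ terms: an array of heap-allocated objects forces a pointer dereference per element (the analogue of JVM reference chasing, object headers, and GC-tracked objects), while the decomposed "raw" layout stores each field in a flat primitive array. The `Point`/`RawPoints` types are illustrative only:

```cpp
#include <memory>
#include <vector>

// Object layout vs. decomposed raw layout (analogy to Flint, in C++).
struct Point { double x, y; };

// Object layout: one heap allocation and one pointer per element.
double sum_x_objects(const std::vector<std::unique_ptr<Point>>& pts) {
    double s = 0;
    for (const auto& p : pts) s += p->x;  // pointer chase on every element
    return s;
}

// Decomposed layout: each field is a contiguous primitive array, so a
// scan is a sequential sweep with no per-element indirection, and (on
// the JVM) no per-element object for the garbage collector to track.
struct RawPoints { std::vector<double> x, y; };

double sum_x_raw(const RawPoints& pts) {
    double s = 0;
    for (double v : pts.x) s += v;
    return s;
}
```

Both functions compute the same result; the raw layout simply removes the indirection that the object layout pays for on every access.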
Mammoth: Memory-Centric MapReduce System
- A novel rule-based heuristic to prioritize memory allocation, with a revocation mechanism
- A multi-threaded execution engine that realizes global memory management
- Compatible with Hadoop
- Source available on GitHub and at the ASF
- Featured in IEEE Computer Spotlight
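A hypothetical sketch of the rule-based allocation/revocation idea: memory consumers are ranked, and when an allocation cannot be satisfied, the global manager revokes memory from the lowest-priority holder first. All names and the specific revocation rule here are illustrative assumptions, not Mammoth's actual design:

```cpp
#include <map>
#include <string>
#include <utility>

// Illustrative global memory manager: higher priority = revoked last.
struct MemoryManager {
    size_t free_bytes;
    // priority -> (consumer name, bytes held); multimap keeps holders
    // sorted by priority, so begin() is always the cheapest to revoke.
    std::multimap<int, std::pair<std::string, size_t>> held;

    bool allocate(const std::string& who, int priority, size_t bytes) {
        // Revoke from the lowest-priority holders until the request fits.
        while (free_bytes < bytes && !held.empty()) {
            auto lowest = held.begin();
            free_bytes += lowest->second.second; // reclaim its memory
            held.erase(lowest);
        }
        if (free_bytes < bytes) return false;    // still cannot satisfy
        free_bytes -= bytes;
        held.insert({priority, {who, bytes}});
        return true;
    }
};
```

In a real engine the revoked consumer would spill its buffers to disk rather than simply lose them; the sketch only shows the prioritization rule.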
Landscape of Disk-Based and In-Memory Data Management Systems (2014)
Thanks!