In-Memory Computing: New Architecture for Big Data Processing
Hai Jin
Cluster and Grid Computing Lab / Services Computing Technology and System Lab / Big Data Technology and System Lab
School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
hjin@hust.edu.cn
Big Data: More Than Big
Varieties of Data Internet data Scientific data Multimedia data Enterprise/IoT data
Characteristics of Enterprise/IoT Data Regular structure and expression Rapid growth Timeliness data processing High value density High security requirement
In-Memory Computing
Gartner Top 10 Strategic Technology Trends for 2013
Overview of SAP HANA
Oracle's Big Data Appliance / Exalytics
Energy Consumption in Modern Electronic ICT Systems
DRAM Read Power vs. Data Rate
Downsides of DRAM Refresh
- Energy consumption: each refresh consumes energy
- Performance degradation: a DRAM bank is unavailable while it is refreshed
- QoS/predictability impact: (long) pause times during refresh
- Refresh rate limits DRAM capacity scaling
Refresh Overhead: Performance (about 8% loss at today's densities, projected up to 46% at 64 Gb)
Refresh Overhead: Energy (about 15% at today's densities, projected up to 47% at 64 Gb)
Emerging Non-volatile Memory (NVM) Technologies
Intel-Micron 3D XPoint Technology
International Technology Roadmap for Semiconductors Projection
Intel-Micron 3D XPoint Technology
New Advances on Memory and Storage: Storage Class Memory
Storage Class Memory
SCM is a new class of data storage and memory devices. SCM blurs the distinction between MEMORY (fast, expensive, volatile) and STORAGE (slow, cheap, non-volatile).
Characteristics of SCM:
- Solid state, no moving parts
- Short access times (DRAM-like, within an order of magnitude)
- Low cost per bit (disk-like, within an order of magnitude)
- Non-volatile (~10 years retention)
Emerging NVM Technological Choice: Time Slots by Application
Reconstruction of Virtual Memory Architecture: Break the I/O Bottleneck
New In-Memory Computing Architecture
Build a DRAM + SCM hybrid hierarchical/parallel memory structure.
(Figure: data flows over the memory channel to hybrid DRAM/SCM memory and over the I/O channel to disk.)
The new in-memory computing architecture supports the shift from computing-centric processing to a combination of computing and data.
100x Faster Access to Data using SCM
HP Nanostore Project
The Machine - Memory-Driven Computing
The Machine - Memory-Driven Computing
The Machine Roadmap
Technology and System of In-Memory Computing for Big Data Processing: Project Overview
Total budget: 170M RMB
Period: January 2015 to December 2017
Participants: Inspur, Huawei, Sugon, Shanghai Jiaotong University, Huazhong University of Science and Technology, Chongqing University, National University of Defense Technology, Huadong Normal University, Jiangnan Computing Institute
Technology and System of In-Memory Computing for Big Data Processing: Project Mission
- A hybrid NVM-based, highly reliable, massive-storage, low-power in-memory computing system; the NVM capacity of each node should be at the TB level, supporting zero bootup
- System software and a simulation platform for the in-memory computing system
- A parallel processing system for the in-memory computing system
- An in-memory database for the hybrid memory architecture to support decision making and other data management applications
System Architecture
Full System Simulator
(Figure: applications run on cores sharing a latency-aware cache; software-level migration libraries and an OS migrator sit above hardware-level migration strategies, with a data migration service and channel-redirect mapping between DRAM and PCM.)
- Designed based on MARSSx86 + NVMain
- More flexible: supports NVMain, DRAMSim, or HybridSim as the main-memory simulator
- Simulates the memory system precisely; easy to configure
Data Migration
Data placement determines access latency, energy consumption, and device lifetime.
Explicit migration: allocate and migrate virtual pages specified by the programmer.

    #include "lib/migrate.hh"

    int main() {
        Migrate_Init();
        p1 = malloc(size_1);
        p2 = malloc(size_2);
        Migrate(p1, size_1, PCM_MEM);
        Migrate(p2, size_2, DRAM_MEM);
    }

(Figure: virtual pages p1 and p2 in the virtual address space are mapped via migration to PCM and DRAM regions of the physical address space.)
Data Migration
Implicit migration: allocate virtual pages automatically and migrate them later, for low access latency and low energy consumption.
(Figure: cores record access information; pages are classified with an energy model and migrated between PCM and DRAM.)
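The implicit scheme can be sketched as a counter-based classifier: the runtime records per-page access counts and, at the end of each epoch, places hot pages in DRAM and cold pages in PCM. This is an illustrative sketch; the names (`MigrationManager`, `HOT_THRESHOLD`, `record_access`) are hypothetical, not the project's actual API:

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical sketch of implicit migration: count accesses per virtual
// page, then periodically move hot pages to DRAM (low latency) and cold
// pages to PCM (low static energy, large capacity).
enum class Device { DRAM, PCM };

struct MigrationManager {
    std::unordered_map<uint64_t, uint64_t> access_count; // page -> #accesses
    std::unordered_map<uint64_t, Device> placement;      // page -> device
    static constexpr uint64_t HOT_THRESHOLD = 64;        // assumed cutoff

    void record_access(uint64_t page) { ++access_count[page]; }

    // Classify pages and "migrate" them each epoch; a real system would
    // also copy the physical frame and update the page table here.
    void migrate_epoch() {
        for (auto& [page, count] : access_count) {
            placement[page] = (count >= HOT_THRESHOLD) ? Device::DRAM
                                                       : Device::PCM;
            count = 0; // age the counters so placement tracks phase changes
        }
    }
};
```

Resetting counters each epoch is one simple aging rule; a real policy would also weigh the energy model shown in the figure.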
Miss Penalty Aware Cache Replacement
On an LLC miss: low latency if the refill is a DRAM row-buffer hit; high latency if it must access PCM.
The miss-penalty-aware policy balances data locality against miss latency.
Flint: Exploiting Raw Data of In-Memory Data Objects in Distributed Data-Parallel Systems
- Completely decomposes in-memory data objects to eliminate reference chasing and reduce frequent garbage collection in the JVM
- A system that automatically converts user code, decomposes data objects, and manages in-memory raw data
- Speedup from 2x to 5.67x compared with Apache Spark

App.     LR      KMeans  PR      CC     Bagel-PR  Bagel-CC  WC
Spark    15.7s   35s     930s    90s    276.6s    204s      900s
Flint    6s      10s     237s    36s    56s       36s       426s
Speedup  2.61x   3.5x    3.92x   2.5x   4.94x     5.67x     2.11x
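The decomposition idea can be illustrated in C++ terms: an array of heap-allocated objects forces a pointer dereference per element (the analogue of JVM reference chasing, object headers, and GC-tracked objects), while the decomposed "raw" layout stores each field in a flat primitive array. The `Point`/`RawPoints` types are illustrative only:

```cpp
#include <memory>
#include <vector>

// Object layout vs. decomposed raw layout (analogy to Flint, in C++).
struct Point { double x, y; };

// Object layout: one heap allocation and one pointer per element.
double sum_x_objects(const std::vector<std::unique_ptr<Point>>& pts) {
    double s = 0;
    for (const auto& p : pts) s += p->x;  // pointer chase on every element
    return s;
}

// Decomposed layout: each field is a contiguous primitive array, so a
// scan is a sequential sweep with no per-element indirection, and (on
// the JVM) no per-element object for the garbage collector to track.
struct RawPoints { std::vector<double> x, y; };

double sum_x_raw(const RawPoints& pts) {
    double s = 0;
    for (double v : pts.x) s += v;
    return s;
}
```

Both functions compute the same result; the raw layout simply removes the indirection that the object layout pays for on every access.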
Mammoth: Memory-Centric MapReduce System
- A novel rule-based heuristic to prioritize memory allocation, with a revocation mechanism
- A multi-threaded execution engine that realizes global memory management
- Compatible with Hadoop
- Source available on GitHub and at the ASF
- Featured in IEEE Computer Spotlight
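A hypothetical sketch of the rule-based allocation/revocation idea: memory consumers are ranked, and when an allocation cannot be satisfied, the global manager revokes memory from the lowest-priority holder first. All names and the specific revocation rule here are illustrative assumptions, not Mammoth's actual design:

```cpp
#include <map>
#include <string>
#include <utility>

// Illustrative global memory manager: higher priority = revoked last.
struct MemoryManager {
    size_t free_bytes;
    // priority -> (consumer name, bytes held); multimap keeps holders
    // sorted by priority, so begin() is always the cheapest to revoke.
    std::multimap<int, std::pair<std::string, size_t>> held;

    bool allocate(const std::string& who, int priority, size_t bytes) {
        // Revoke from the lowest-priority holders until the request fits.
        while (free_bytes < bytes && !held.empty()) {
            auto lowest = held.begin();
            free_bytes += lowest->second.second; // reclaim its memory
            held.erase(lowest);
        }
        if (free_bytes < bytes) return false;    // still cannot satisfy
        free_bytes -= bytes;
        held.insert({priority, {who, bytes}});
        return true;
    }
};
```

In a real engine the revoked consumer would spill its buffers to disk rather than simply lose them; the sketch only shows the prioritization rule.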
Landscape of Disk-Based and In-Memory Data Management Systems (2014)
Thanks!