High-Performance Processing of Large Data Sets via Memory Mapping A Case Study in R and C++



Similar documents
The ff package: Handling Large Data Sets in R with Memory Mapped Pages of Binary Flat Files

Memory Management Outline. Background Swapping Contiguous Memory Allocation Paging Segmentation Segmented Paging

farmerswife Contents Hourline Display Lists 1.1 Server Application 1.2 Client Application farmerswife.com

Your Best Next Business Solution Big Data In R 24/3/2010

Performance Characteristics of VMFS and RDM VMware ESX Server 3.0.1

Virtualization. Types of Interfaces

System Requirements Table of contents

Intel Solid- State Drive Data Center P3700 Series NVMe Hybrid Storage Performance

Lecture 17: Virtual Memory II. Goals of virtual memory

Virtuoso and Database Scalability

Copyright by Parallels Holdings, Ltd. All rights reserved.

Grant Management. System Requirements

Operating Systems: Basic Concepts and History

4.1 Introduction 4.2 Explain the purpose of an operating system Describe characteristics of modern operating systems Control Hardware Access

Outline: Operating Systems

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

Terminal Server Software and Hardware Requirements. Terminal Server. Software and Hardware Requirements. Datacolor Match Pigment Datacolor Tools

Enterprise Edition. Hardware Requirements

An Overview of Flash Storage for Databases

PARALLELS SERVER BARE METAL 5.0 README

Professional and Enterprise Edition. Hardware Requirements

Benchmark Results of Fengqi.Asia

INSTALLATION MINIMUM REQUIREMENTS. Visit us on the Web

Hardware and Software Requirements for Installing California.pro

Speeding Up Cloud/Server Applications Using Flash Memory

Software and Hardware Requirements

EFFICIENT EXTERNAL SORTING ON FLASH MEMORY EMBEDDED DEVICES

Abila Grant Management. System Requirements

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 412, University of Maryland. Guest lecturer: David Hovemeyer.

Linux flash file systems JFFS2 vs UBIFS

Computer Engineering and Systems Group Electrical and Computer Engineering SCMFS: A File System for Storage Class Memory

Instruction Set Architecture (ISA)

Recommended hardware system configurations for ANSYS users

Determining Your Computer Resources

Main Memory & Backing Store. Main memory backing storage devices

Hard Disk Drive vs. Kingston SSDNow V+ 200 Series 240GB: Comparative Test

LabStats 5 System Requirements

PARALLELS SERVER 4 BARE METAL README

AIX NFS Client Performance Improvements for Databases on NAS

Cloud Computing through Virtualization and HPC technologies

12. Introduction to Virtual Machines

Chapter 2: Computer-System Structures. Computer System Operation Storage Structure Storage Hierarchy Hardware Protection General System Architecture

CPU Organization and Assembly Language

Bringing Big Data Modelling into the Hands of Domain Experts

Binary search tree with SIMD bandwidth optimization using SSE

How To Write A Page Table

September 25, Maya Gokhale Georgia Institute of Technology

A Deduplication File System & Course Review

RightNow CX November 2011 Workstation Specifications

AP ENPS ANYWHERE. Hardware and software requirements

Computer Architecture. Secure communication and encryption.

Benchmarking Hadoop & HBase on Violin

OPERATING SYSTEM - MEMORY MANAGEMENT

Sun 8Gb/s Fibre Channel HBA Performance Advantages for Oracle Database

Faronics Products SYSTEM REQUIREMENTS Last modified: October 2014

Table 1 summarizes the requirements for desktop computers running the Participant Application and the myat&t utility.

Parallels Virtuozzo Containers 4.7 for Linux Readme

Hypervisor Software and Virtual Machines. Professor Howard Burpee SMCC Computer Technology Dept.

Minimum Computer System Requirements

Multi-core Programming System Overview

Evaluating HDFS I/O Performance on Virtualized Systems

USB Flash Drives as an Energy Efficient Storage Alternative

Virtual Machines. COMP 3361: Operating Systems I Winter

IBM Tivoli Composite Application Manager for Microsoft Applications: Microsoft Hyper-V Server Agent Version Fix Pack 2.

Large system usage HOW TO. George Magklaras PhD Biotek/NCMM IT USIT Research Computing Services

The Benefits of Using VSwap

CS 377: Operating Systems. Outline. A review of what you ve learned, and how it applies to a real operating system. Lecture 25 - Linux Case Study

DESKTOP VIRTUALIZATION SIZING AND COST MODELING PROCESSOR MEMORY STORAGE NETWORK

White Paper Open-E NAS Enterprise and Microsoft Windows Storage Server 2003 competitive performance comparison

SIDN Server Measurements

Running Windows on a Mac. Why?

AirWave 7.7. Server Sizing Guide

Topics in Computer System Performance and Reliability: Storage Systems!

Wide-area Network Acceleration for the Developing World. Sunghwan Ihm (Princeton) KyoungSoo Park (KAIST) Vivek S. Pai (Princeton)

Computers. Hardware. The Central Processing Unit (CPU) CMPT 125: Lecture 1: Understanding the Computer

Fact Sheet In-Memory Analysis

PERFORMANCE ENHANCEMENTS IN TreeAge Pro 2014 R1.0

Chapter 6. Inside the System Unit. What You Will Learn... Computers Are Your Future. What You Will Learn... Describing Hardware Performance

Simnet Registry Repair User Guide. Edition 1.3

Benchmark Hadoop and Mars: MapReduce on cluster versus on GPU

Enabling Technologies for Distributed and Cloud Computing

COS 318: Operating Systems. Virtual Memory and Address Translation

BEST PRACTICES FOR PARALLELS CONTAINERS FOR LINUX

Platform Guide. SA Supported Platforms. Service Package Version 7.4R1

Data Centers and Cloud Computing

International Journal of Computer & Organization Trends Volume20 Number1 May 2015

Praktijkexamen met Project VRC. Virtual Reality Check

Gigabit Ethernet Packet Capture. User s Guide

Machine Architecture and Number Systems. Major Computer Components. Schematic Diagram of a Computer. The CPU. The Bus. Main Memory.

CSE 6040 Computing for Data Analytics: Methods and Tools

Virtualization in Linux KVM + QEMU

The Quest for Speed - Memory. Cache Memory. A Solution: Memory Hierarchy. Memory Hierarchy

The High Performance Internet of Things: using GVirtuS for gluing cloud computing and ubiquitous connected devices

Enabling Technologies for Distributed Computing

Retour d expérience : portage d une application haute-performance vers un langage de haut niveau

Platform Guide. SA Supported Platforms. Service Package Version 7.3R1

System Requirements G E N E R A L S Y S T E M R E C O M M E N D A T I O N S

Effects of Memory Randomization, Sanitization and Page Cache on Memory Deduplication

Maximizer CRM 12 Summer 2013 system requirements

Transcription:

High-Performance Processing of Large Data Sets via Memory Mapping A Case Study in R and C++ Daniel Adler, Jens Oelschlägel, Oleg Nenadic, Walter Zucchini Georg-August University Göttingen, Germany - Research Consultant, Munich, Germany Joint Statistical Meeting - 5th of August 2008 - Denver, USA

Motivation Working with large data sets in R is restricted by virtual memory and the virtual address space.

Motivation Working with large data sets in R is restricted by virtual memory and the virtual address space. R console-mode (standard edition) on Mac OS X 10.5, 4 GB > numeric(1024^3*2.3/8) Error: cannot allocate vector of size 2.3 Gb

Motivation Working with large data sets in R is restricted by virtual memory and the virtual address space. R console-mode (standard edition) on Mac OS X 10.5, 4 GB > numeric(1024^3*2.3/8) Error: cannot allocate vector of size 2.3 Gb R console-mode on Linux 2.6 (Ubuntu), 2 GB RAM (Parallels VM) > numeric(1024^3*2.4/8) Error: cannot allocate vector of size 2.4 Gb

Motivation Working with large data sets in R is restricted by virtual memory and the virtual address space. R console-mode (standard edition) on Mac OS X 10.5, 4 GB > numeric(1024^3*2.3/8) Error: cannot allocate vector of size 2.3 Gb R console-mode on Linux 2.6 (Ubuntu), 2 GB RAM (Parallels VM) > numeric(1024^3*2.4/8) Error: cannot allocate vector of size 2.4 Gb R on Windows XP, 2 GB RAM (Parallels VM) > numeric(1024^3*2/8) Error: cannot allocate vector of size 2.0 Gb Reached total allocation of 1535Mb.. > memory.limit() [1] 1535.36

Motivation Working with large data sets in R is restricted by virtual memory and the virtual address space. R on 32-Bit Windows Kernel User 4 GB Address Space 2 GB User Space

Motivation Working with large data sets in R is restricted by virtual memory and the virtual address space. R on 32-Bit Windows (tweaked with "/3GB") Kernel User 4 GB Address Space 3 GB User Space

Motivation Working with large data sets in R is restricted by virtual memory and the virtual address space. R on 64-Bit Windows User 4 GB Address Space 4 GB User Space

Motivation Working with large data sets in R is restricted by virtual memory and the virtual address space. Goal Enabling work with large data sets on desktop PCs.

Motivation Working with large data sets in R is restricted by virtual memory and the virtual address space. Goal Enabling work with large data sets on desktop PCs. Idea Data resides on disk storage. Parsimonious use of virtual address space via paging.

Motivation Working with large data sets in R is restricted by virtual memory and the virtual address space. Goal Enabling work with large data sets on desktop PCs. Idea Data resides on disk storage. Parsimonious use of virtual address space via paging. Efficiency problem Disk I/O is slow (1 million times slower than RAM).

Overview 'ff' low-level C++ Library (Daniel) Virtual Memory & Memory-Mapping, Flat Files & Paging 'ff' high-level R Package (Jens) Virtual Atomic Objects, Batch processing, Data types, Hybrid Indexing Performance Page size & System cache, Enhancements Epilog Possible Improvements & Conclusion

Virtual Memory and Memory-Mapping load/store data at location Operating System

Virtual Memory and Memory-Mapping process load/store data at location CPU Operating System RAM

Virtual Memory and Memory-Mapping process load/store data at location CPU Operating System RAM

Virtual Memory and Memory-Mapping process load/store data at location CPU virtual to physical address translation Operating System RAM

Virtual Memory and Memory-Mapping process load/store data at location CPU virtual to physical address translation Operating System RAM

Virtual Memory and Memory-Mapping process load/store data at location CPU virtual to physical address translation Operating System RAM

Virtual Memory and Memory-Mapping process load/store data at location CPU virtual to physical address translation RAM RAM 0 1 2 3 4 5... pagesize (4k = 2^10 bytes)

Virtual Memory and Memory-Mapping process load/store data at location CPU virtual to physical address translation RAM Virtual Memory Address bit 31 bit 9 bit 0 page number (22 bits) offset (10 bits) 0 1 2 3 4 5... pagesize (4k = 2^10 bytes) RAM

Virtual Memory and Memory-Mapping process load/store data at location CPU virtual to physical address translation RAM Virtual Memory Address bit 31 bit 9 bit 0 page number (22 bits) Process Page Table, e.g Page Physical location 200 RAM page 3 201 RAM page 1 202 N/A offset (10 bits) 0 1 2 3 4 5... pagesize (4k = 2^10 bytes) RAM

Virtual Memory and Memory-Mapping process load/store data at location CPU virtual to physical address translation RAM Virtual Memory Address bit 31 bit 9 bit 0 page number (22 bits) Process Page Table, e.g offset (10 bits) swap file 0 1 2 3 4 5 6... Page Physical location 200 RAM page 3 201 Swap file page 5 202 N/A 0 1 2 3 4 5... pagesize (4k = 2^10 bytes) RAM swap in & out

Virtual Memory and Memory-Mapping process load/store data at location CPU virtual to physical address translation RAM Virtual Memory Address ordinary file bit 31 bit 9 bit 0 page number (22 bits) Process Page Table, e.g offset (10 bits) swap file 0 1 2 3 4 5 6... memory mapping Page Physical location 200 RAM page 3 201 Swap page 5 202 RAM page 4, File offset 0 1 2 3 4 5... pagesize (4k = 2^10 bytes) RAM swap in & out

Flat Files and Paging

Flat Files and Paging Data resides on hard disk binary flat files.

Flat Files and Paging Data resides on hard disk binary flat files. One fixed-size file section is mapped into the process's virtual address space.

Flat Files and Paging Data resides on hard disk binary flat files. One fixed-size file section is mapped into the process's virtual address space. The file section offset is moveable.

Flat Files and Paging Data resides on hard disk binary flat files. One fixed-size file section is mapped into the process's virtual address space. The file section offset is moveable. Modified sections are written back to disk.

Flat Files and Paging Data resides on hard disk binary flat files. One fixed-size file section is mapped into the process's virtual address space. The file section offset is moveable. Modified sections are written back to disk. Virtual address space costs = section size.

Virtual atomic R objects

Virtual atomic R objects Creating vectors, matrices, arrays and factors. > vec <- ff(vmode= double,length=100000000) > mat <- ff(vmode= double,dim=c(5000,6000)) > arr <- ff(vmode= integer,dim=c(10,200,300)) > fac <- ff(vmode="integer",levels=c('a','b'),length=10e6))

Virtual atomic R objects Creating vectors, matrices, arrays and factors. > vec <- ff(vmode= double,length=100000000) > mat <- ff(vmode= double,dim=c(5000,6000)) > arr <- ff(vmode= integer,dim=c(10,200,300)) > fac <- ff(vmode="integer",levels=c('a','b'),length=10e6)) Standard subsetting in R > vec[1:1000] <- rnorm(1000) > sum(mat[c(1,3,4),]) > arr[5:1,100,150] > fac[] <- 'B'

Virtual atomic R objects Creating vectors, matrices, arrays and factors. > vec <- ff(vmode= double,length=100000000) > mat <- ff(vmode= double,dim=c(5000,6000)) > arr <- ff(vmode= integer,dim=c(10,200,300)) > fac <- ff(vmode="integer",levels=c('a','b'),length=10e6)) Standard subsetting in R > vec[1:1000] <- rnorm(1000) > sum(mat[c(1,3,4),]) > arr[5:1,100,150] > fac[] <- 'B' Batch processing > s <- 0 > ffvecapply( s <<- s + sum(vec[i1:i2]), X=vec ) > mymean <- s/length(vec)

data types and packing

data types and packing vmode size R mode NA handling range boolean 1 bit logical TRUE,FALSE logical 2 bit logical NA TRUE,FALSE quad 2 bit integer 0:3 nibble 4 bit integer 0:15 byte 8 bit integer NA -127:+127 ubyte 8 bit integer 0:255 short 16 bit integer NA -32767:+32767 ushort 16 bit integer 0:65535 integer 32 bit integer NA -(2^31-1):+(2^31-1) single 32 bit double NA C float double 64 bit double NA C double raw 8 bit raw 0:255

Hybrid Indexing

Hybrid Indexing index expressions are packed (if possible) before evaluated. Saves space! > vec[1:(length(vec)/2)] <- 1

Hybrid Indexing index expressions are packed (if possible) before evaluated. Saves space! > vec[1:(length(vec)/2)] <- 1 otherwise... > 1:(length(vec)/2) cannot allocate vector of size 2.0 Gb

Hybrid Indexing index expressions are packed (if possible) before evaluated. Saves space! > vec[1:(length(vec)/2)] <- 1 from to by 1 length(vec)/2 1 otherwise... > 1:(length(vec)/2) cannot allocate vector of size 2.0 Gb

Page size and system cache 0 50 100 150 2 GB double ff vector, 10,000,000 random accesses, Intel Mac OS X 10.5, 4 GB, 2.5 GhZ 64 mb randomwrite mmeachflush randomwrite mmnoflush 64 kb randomread mmeachflush randomread mmnoflush 64 mb 64 kb pagesize seqwrite mmeachflush seqwrite mmnoflush 64 mb 64 kb seqread mmeachflush seqread mmnoflush 64 mb 64 kb 0 50 100 150 elapsed

Performance Enhancements Presorting indices in ascending order to minimize disk head movements. Fast creation of flat files. Using system cache to prevent Disk I/O. Increase page size to reduce pagings. Exploit parallelism; Flat files are shareable among multiple R processes.

Possible improvements Increase index resolution to 52 bits in R. Support for mixed-type data frames. On-demand presorting indices. Automatic adjustments of system cache usage and page size. Paging Garbage Collector?

Conclusion Memory-mapping in contrast to streambased Disk I/O has advantage of exploiting system cache and - at the same time - allow to share pages among multiple processes. While system cache enabled will also consume physical memory it still does not consume more virtual address space.

Availability of the 'ff' package Version 2.0.0 on CRAN (since Monday) GPL-2, C++ Library ISCL (BSD style) Web resources: http://134.76.173.220/ff * http://www.truecluster.com/ff.htm * contains version 1(64 bit internal indexing), slides, datasets