Resource Aware Scheduler for Storm
Software Design Document

Author: Boyang Jerry Peng <jerrypeng@yahoo-inc.com> <jerry.boyang.peng@gmail.com>
Date: 09/18/2015

Table of Contents

1. INTRODUCTION
    1.1. USING RESOURCE AWARE SCHEDULER
2. API
    2.1. SETTING MEMORY REQUIREMENT
    2.2. SETTING CPU REQUIREMENT
    2.3. LIMITING THE HEAP SIZE PER WORKER (JVM) PROCESS
    2.4. SETTING AVAILABLE RESOURCES ON NODE
    2.5. OTHER CONFIGURATIONS

1. Introduction

The purpose of this document is to provide a description of the Resource Aware Scheduler for the Storm distributed real-time computation system. This document provides a high-level description of the Resource Aware Scheduler in Storm.

1.1. Using Resource Aware Scheduler

The user can switch to the Resource Aware Scheduler by setting the following in conf/storm.yaml:

    storm.scheduler: backtype.storm.scheduler.resource.ResourceAwareScheduler

2. API

For a Storm topology, the user can now specify the amount of resources that a topology component (i.e. Spout or Bolt) requires to run a single instance of the component. The user can specify the resource requirements for a topology component by using the following API calls.

2.1. Setting Memory Requirement

API to set component memory requirement:

    public T setMemoryLoad(Number onHeap, Number offHeap)

    onHeap: The amount of on-heap memory an instance of this component will consume, in megabytes
    offHeap: The amount of off-heap memory an instance of this component will consume, in megabytes

The user also has the option to specify only the on-heap memory requirement if the component does not need off-heap memory:

    public T setMemoryLoad(Number onHeap)

    onHeap: The amount of on-heap memory an instance of this component will consume, in megabytes

If no value is provided for offHeap, 0.0 will be used. If no value is provided for onHeap, or if the API is never called for a component, the default value will be used.

Example of Usage:

    SpoutDeclarer s1 = builder.setSpout("word", new TestWordSpout(), 10);
    s1.setMemoryLoad(1024.0, 512.0);
    builder.setBolt("exclaim1", new ExclamationBolt(), 3)
            .shuffleGrouping("word").setMemoryLoad(512.0);

The total memory requested for this topology is 16.5 GB: 10 spout instances with 1 GB of on-heap memory and 0.5 GB of off-heap memory each (15 GB), plus 3 bolt instances with 0.5 GB of on-heap memory each (1.5 GB).

2.2. Setting CPU Requirement

API to set component CPU requirement:

    public T setCPULoad(Double amount)

    amount: The amount of CPU an instance of this component will consume, in points

Currently, the amount of CPU resources that a component requires, or that is available on a node, is represented by a point system. CPU usage is a difficult concept to define. Different CPU architectures perform differently depending on the task at hand, and they are so complex that expressing all of that in a single, precise, portable number is impossible. Instead we take a convention-over-configuration approach and are primarily concerned with the rough level of CPU usage, while still allowing more fine-grained amounts to be specified.

By convention a CPU core typically gets 100 points. If you feel that your processors are more or less powerful, you can adjust this accordingly. Heavy tasks that are CPU bound get 100 points, as they can consume an entire core. Medium tasks should get 50, light tasks 25, and tiny tasks 10. In some cases a task spawns other threads to help with processing; such tasks may need to go above 100 points to express the amount of CPU they use. If these conventions are followed, then in the common case of a single-threaded task the reported Capacity * 100 should be the number of CPU points that the task needs.
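The point conventions above can be written down as a small Java helper. This is only a sketch and not part of the Storm API: the class name CpuPoints, its constants, and the fromCapacity method are hypothetical names introduced for illustration, and the sketch assumes that the reported Capacity is a fraction (for example 0.25 for a task that is busy 25% of the time).

```java
/** Illustrative helper (not part of Storm) encoding the CPU point conventions. */
final class CpuPoints {
    /** By convention, one CPU core corresponds to 100 points. */
    static final double POINTS_PER_CORE = 100.0;

    // Conventional point values for common task weights.
    static final double HEAVY = 100.0;  // CPU bound, can consume an entire core
    static final double MEDIUM = 50.0;
    static final double LIGHT = 25.0;
    static final double TINY = 10.0;

    private CpuPoints() {}

    /**
     * Estimates the points needed by a single-threaded task from its reported
     * Capacity. Assumption: capacity is a fraction, where 1.0 is a full core.
     */
    static double fromCapacity(double capacity) {
        if (Double.isNaN(capacity) || capacity < 0.0) {
            throw new IllegalArgumentException("capacity must be non-negative");
        }
        return capacity * POINTS_PER_CORE;
    }
}
```

Under these assumptions, a single-threaded task with a reported Capacity of 0.25 maps to 25 points, which matches the "light task" convention. The resulting value is what would be passed to setCPULoad.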
Example of Usage:

    SpoutDeclarer s1 = builder.setSpout("word", new TestWordSpout(), 10);
    s1.setCPULoad(15.0);
    builder.setBolt("exclaim1", new ExclamationBolt(), 3)
            .shuffleGrouping("word").setCPULoad(10.0);

2.3. Limiting the Heap Size per Worker (JVM) Process

    public void setTopologyWorkerMaxHeapSize(Number size)

    size: The memory limit a worker process will be allocated, in megabytes

The user can limit the amount of memory that the Resource Aware Scheduler allocates to a single worker, on a per-topology basis, by using the above API. This API is in place so that users can spread executors across multiple workers. However, spreading executors across multiple workers may increase communication latency, since executors in different workers cannot use the Disruptor queue for intra-process communication.

Example of Usage:

    Config conf = new Config();
    conf.setTopologyWorkerMaxHeapSize(512.0);

2.4. Setting Available Resources on Node

A Storm administrator can specify node resource availability by modifying the conf/storm.yaml file located in the Storm home directory of that node.

A Storm administrator can specify how much memory a node has available, in megabytes, by adding the following to storm.yaml:

    supervisor.memory.capacity.mb: [amount<double>]

A Storm administrator can also specify how much CPU a node has available by adding the following to storm.yaml:

    supervisor.cpu.capacity: [amount<double>]

Note: the amount specified for the available CPU is expressed in the point system discussed earlier.

Example of Usage:

    supervisor.memory.capacity.mb: 20480.0
    supervisor.cpu.capacity: 100.0

2.5. Other Configurations

The user can set some default configurations for the Resource Aware Scheduler in conf/storm.yaml:

    # default value if on-heap memory requirement is not specified for a component
    topology.component.resources.onheap.memory.mb: 128.0

    # default value if off-heap memory requirement is not specified for a component
    topology.component.resources.offheap.memory.mb: 0.0

    # default value if CPU requirement is not specified for a component
    topology.component.cpu.pcore.percent: 10.0

    # default value for the max heap size for a worker
    topology.worker.max.heap.size.mb: 768.0
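To make the relationship between component requirements and node capacity concrete, the following Java sketch totals the resources requested by the example topology used in this document and compares them with the example node capacity. It is a back-of-the-envelope calculation, not the scheduler's algorithm. The class ResourceEstimate and everything in it are hypothetical names introduced for illustration. The sketch also makes two assumptions that this document does not state: that the memory and CPU examples describe the same topology, and that a node is never assigned more than its advertised capacity.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sizing helper; not part of Storm and not the scheduler's logic. */
final class ResourceEstimate {

    /** Per-instance requirements of one component, plus its parallelism. */
    static final class Component {
        final int parallelism;
        final double onHeapMb;
        final double offHeapMb;
        final double cpuPoints;

        Component(int parallelism, double onHeapMb, double offHeapMb, double cpuPoints) {
            this.parallelism = parallelism;
            this.onHeapMb = onHeapMb;
            this.offHeapMb = offHeapMb;
            this.cpuPoints = cpuPoints;
        }
    }

    private ResourceEstimate() {}

    /** Sum of (on-heap + off-heap) memory over all instances, in megabytes. */
    static double totalMemoryMb(List<Component> components) {
        double total = 0.0;
        for (Component c : components) {
            total += c.parallelism * (c.onHeapMb + c.offHeapMb);
        }
        return total;
    }

    /** Sum of CPU points over all instances. */
    static double totalCpuPoints(List<Component> components) {
        double total = 0.0;
        for (Component c : components) {
            total += c.parallelism * c.cpuPoints;
        }
        return total;
    }

    /**
     * Lower bound on the number of identical nodes needed, assuming node
     * capacities are hard limits (an assumption, see the text above).
     */
    static int minNodes(double memoryMb, double cpuPoints,
                        double nodeMemoryMb, double nodeCpuPoints) {
        int byMemory = (int) Math.ceil(memoryMb / nodeMemoryMb);
        int byCpu = (int) Math.ceil(cpuPoints / nodeCpuPoints);
        return Math.max(byMemory, byCpu);
    }

    public static void main(String[] args) {
        List<Component> topology = Arrays.asList(
                new Component(10, 1024.0, 512.0, 15.0),  // spout "word"
                new Component(3, 512.0, 0.0, 10.0));     // bolt "exclaim1"

        double memoryMb = totalMemoryMb(topology);
        double cpu = totalCpuPoints(topology);
        System.out.println("memory (MB): " + memoryMb);
        System.out.println("cpu points:  " + cpu);
        System.out.println("min nodes:   " + minNodes(memoryMb, cpu, 20480.0, 100.0));
    }
}
```

For the example topology the totals are 16896 MB (16.5 GB) of memory and 180 CPU points. A node with 20480 MB of memory and 100 CPU points has enough memory for the whole topology but not enough CPU, so under the stated assumptions at least two such nodes would be needed.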