CactoScale Guide User Guide
Athanasios Tsitsipas (UULM), Zafeirios Papazachos (QUB), Sakil Barbhuiya (QUB)





Version History

Version  Date        Change                                           Author
0.1      12/10/2014  Initial version                                  Athanasios Tsitsipas (UULM)
0.2      14/01/2015  Added description and install notes              Zafeirios Papazachos (QUB), Sakil Barbhuiya (QUB)
0.3      23/10/2015  Changed tool versions, added new install notes   Athanasios Tsitsipas (UULM)

TABLE OF CONTENTS

1. PURPOSE
2. OVERVIEW
3. PREREQUISITES
4. INSTALLATION OF MONITORING FRAMEWORK
   A) INSTALLING MONITORING CLUSTER
   B) STEP-BY-STEP - ADD NEW NODE TO EXISTING CLUSTER
   C) FORWARD REQUESTS TO A PORT THROUGH THE MONITORING-GATEWAY VM TO A NODE
   D) START / STOP THE MONITORING CLUSTER
   E) CREATE REQUIRED HBASE TABLES
   F) INSTALL AND START CHUKWA COLLECTOR
   G) START CHUKWA AGENT ON A PHYSICAL MACHINE
5. START THE RUNTIME MODEL UPDATER
6. STEP-BY-STEP OFFLINE ANALYSIS GUIDELINE
   A) CREATE HBASE SCHEMA TABLES
   B) IMPORTING STRACE DATA TO HBASE
   C) ANALYSING STRACE DATA STORED IN HBASE
   D) CSV GENERATION FROM THE ANALYSIS RESULTS
   E) TROUBLESHOOTING

1. PURPOSE

This document is a complete guide to using CactoScale: how to install the monitoring framework from scratch, and how to start the Runtime Model Updater (D4.3 Parallel Trace Analysis). Finally, it presents instructions for executing the Pig analysis scripts, and their results, on existing trace data from system calls of chemical computations performed with Molpro 1. The traces were provided by the University of Ulm.

2. OVERVIEW

The tools that CactoScale utilizes are described in depth in (D4.1 Data Collection Framework). CactoScale features extensible monitoring capabilities that allow tracking a variety of resources, such as embedded sensors, external instrumentation devices, hardware counters, error log files, workload traces, network, processor core, memory, storage, and application logs. Additionally, it provides data filtering and correlation analysis tools, designed to run on the vast volumes of data generated by potentially thousands of servers. These capabilities in turn enable CACTOS to address the challenges of managing resources of increased complexity and heterogeneity in cloud infrastructures.

3. PREREQUISITES

The required versions of the tools utilized by CactoScale for the current guide are:
i. Hadoop version: 2.6.0
ii. Zookeeper version: 3.4.6
iii. HBase version: 1.1.1
iv. Pig version: 0.12.1
v. Chukwa version: 0.5.0

In addition to these technologies, a running CDO server 2 is needed.

4. INSTALLATION OF MONITORING FRAMEWORK

The following instructions are based on four virtual machines:
- monitoring-gateway: used for accessing the monitoring cluster; the only publicly accessible VM!
- monitoring01: HDFS namenode, HDFS datanode, HBase master
- monitoring02: HDFS secondarynamenode, HDFS datanode, HBase regionserver
- monitoring03: HDFS datanode, HBase regionserver

All VMs have key-based ssh access to each other. The above cluster setup 3 is also maintained in the OpenStack testbed of the University of Ulm.
1 https://www.molpro.net/
2 http://www.cactosfp7.eu/2015/04/03/cactos-blog-setting-secure-cdo-server/
3 http://www.cactosfp7.eu/2015/08/26/openstack-physical-testbed-part-3-staying-up-to-datesoftware-requirements/
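Since the installation steps below rely on hostname resolution between the nodes, /etc/hosts on every VM might look like the following sketch. The private IP addresses are placeholders chosen for illustration (192.168.0.3 matches the hbase.rootdir example used later), not values taken from the actual testbed:

```
192.168.0.2   monitoring-gateway
192.168.0.3   monitoring01
192.168.0.4   monitoring02
192.168.0.5   monitoring03
```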

a) INSTALLING MONITORING CLUSTER

For the multi-node setup, make sure to set up key-based ssh authentication between all nodes first. Also, set up /etc/hosts correctly on all nodes. On a fresh CentOS 7 VM, perform the following steps:

yum install svn
mkdir cactoscale
cd cactoscale
svn checkout https://svn.fzi.de/svn/cactos/code/scale/trunk/cactoscale_monitoring_framework/ .
export SVNCHECKOUT=./cactoscale

PREPARE THE SETUP ON FIRST NODE

# install required packages first
yum install epel-release wget vim java-openjdk

# download hadoop and hbase binaries
cd ~
wget http://mirror.softaculous.com/apache/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
tar xfzv hadoop-2.6.0.tar.gz
wget http://ftp-stud.hs-esslingen.de/pub/mirrors/ftp.apache.org/dist/hbase/1.1.1/hbase-1.1.1-bin.tar.gz
tar xfzv hbase-1.1.1-bin.tar.gz

# copy helper scripts
cp $SVNCHECKOUT/hCluster/bin/* .
chmod +x ./*.sh

CONFIGURE THE SETUP

# place the config files from this repo
cp $SVNCHECKOUT/hCluster/conf/hadoop/* ~/hadoop-2.6.0/etc/hadoop/
cp $SVNCHECKOUT/hCluster/conf/* ~/hbase-1.1.1/conf/

Change the following configuration files as needed:
- hadoop: core-site.xml, line 46, property "fs.default.name", value "hdfs://monitoring01:8020"
- hadoop: dfs-hosts, line 1, add hostname of namenode(s)
- hadoop: hdfs-site.xml, line 330, property "dfs.https.address", value "monitoring01:50470"
- hadoop: slaves, add hostnames for data nodes
- hbase: hbase-site.xml, line 25, property "hbase.rootdir", value "hdfs://192.168.0.3:8020/hbase"
- hbase: hbase-site.xml, line 37, property "hbase.zookeeper.quorum", value "list of all hbase nodes"
- hbase: regionservers, add hostnames for region servers
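As an illustration of the first edit in the list above, the relevant property block in core-site.xml would look roughly like the sketch below; the hostname follows the cluster naming used in this guide and should be adjusted to your setup:

```xml
<property>
  <name>fs.default.name</name>
  <value>hdfs://monitoring01:8020</value>
</property>
```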

# format the hdfs root dir
cd ~/hadoop-2.6.0
bin/hdfs namenode -format

b) STEP-BY-STEP - ADD NEW NODE TO EXISTING CLUSTER

PREPARE THE NODE

Log in to the new node. Set up key-based ssh login and /etc/hosts.

yum install epel-release wget vim java-openjdk

ADD NEW NODE TO CONFIGURATION

Log in to the first node. Edit the settings:
- hadoop: slaves, add hostnames for data nodes
- hbase: hbase-site.xml, line 37, property "hbase.zookeeper.quorum", value "list of all hbase nodes"
- hbase: regionservers, add hostnames for region servers

COPY SETUP FROM FIRST NODE TO NEW NODE

Log in to the first node and use the helper script:

# run the distribute script
$SVNCHECKOUT/hCluster/bin/distribute.sh <hostname_of_new_node>

Binaries and configuration are now copied and extracted.

c) FORWARD REQUESTS TO A PORT THROUGH THE MONITORING-GATEWAY VM TO A NODE

Make sure iptables is installed; if not, run:

yum install iptables-services

Execute the following rules in a terminal:

sysctl net.ipv4.ip_forward=1
iptables -t nat -A PREROUTING -p tcp --dport <port> -j DNAT --to-destination <monitoring01 ip>:<port>
iptables -t nat -A POSTROUTING -j MASQUERADE
iptables -I FORWARD -p tcp --dport <port> -j ACCEPT

d) START / STOP THE MONITORING CLUSTER

Log in to the master node (monitoring-gateway) via ssh.

$SVNCHECKOUT/hCluster/bin/startHStuff.sh   # starts the local master services AND the slave services via ssh
$SVNCHECKOUT/hCluster/bin/stopHStuff.sh    # stops the local master services AND the slave services via ssh
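The forwarding rules in section 4 c) are easy to mistype. As a convenience sketch, a small helper can print the exact rule set for a given port and destination IP; the port 8080 and the IP used below are illustrative placeholders. The commands are only printed here, so they can be reviewed before being applied as root on the monitoring-gateway VM:

```shell
#!/usr/bin/env bash
# Print the port-forwarding rule set from section 4 c) for a given
# port and destination IP. Nothing is applied; the commands are
# echoed so they can be reviewed first.
make_forward_rules() {
  local port="$1" dest_ip="$2"
  echo "sysctl net.ipv4.ip_forward=1"
  echo "iptables -t nat -A PREROUTING -p tcp --dport ${port} -j DNAT --to-destination ${dest_ip}:${port}"
  echo "iptables -t nat -A POSTROUTING -j MASQUERADE"
  echo "iptables -I FORWARD -p tcp --dport ${port} -j ACCEPT"
}

make_forward_rules 8080 192.168.0.3
```

To apply the rules, the printed commands can be piped to a root shell, e.g. `make_forward_rules 8080 192.168.0.3 | sudo sh`.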

e) CREATE REQUIRED HBASE TABLES

Log on to any virtual machine of the monitoring cluster and execute the command below to create the HBase tables that CactoScale requires to store its information:

hbase shell $SVNCHECKOUT/chukwa/chukwa-cactos/etc/chukwa/cactos-hbase.schema

f) INSTALL AND START CHUKWA COLLECTOR

On the desired machine of the monitoring cluster, e.g. monitoring01, execute the following command:

$SVNCHECKOUT/chukwa/chukwa-collector.sh start

Make sure port 8080, where the Chukwa collector runs, is accessible.

g) START CHUKWA AGENT ON A PHYSICAL MACHINE

On a node that needs monitoring, start a Chukwa agent by executing the following:

yum install svn
mkdir cactoscale
cd cactoscale
svn checkout https://svn.fzi.de/svn/cactos/code/scale/trunk/cactoscale_monitoring_framework/ .
export SVNCHECKOUT=./cactoscale
$SVNCHECKOUT/chukwa/chukwa-agent.sh start

Prior to the last command, the collector IP must be set in the file located at:
$SVNCHECKOUT/chukwa/chukwa-cactos/etc/chukwa/collectors

5. START THE RUNTIME MODEL UPDATER

The Runtime Model Updater is started by executing the commands below on the desired node:

wget https://sdqweb.ipd.kit.edu/eclipse/cactos/cactoscale/runtimemodelupdater/binary_nightly/RuntimeModelUpdater.gtk.linux.x86_64.zip
unzip -q RuntimeModelUpdater.gtk.linux.x86_64.zip
svn checkout https://svn.fzi.de/svn/cactos/code/integration/trunk/eu.cactosfp7.configuration/

At this point, both the product folder and the configuration folder have been obtained. Before starting the Runtime Model Updater, the information in the files cactoscale_model_updater.cfg and integration_cdosession.cfg in the folder eu.cactosfp7.configuration must be filled in according to the naming of the variables. After a successful configuration, execute the following commands to start the Runtime Model Updater:

cd RuntimeModelUpdater.gtk.linux.x86_64
screen -dmS modelupdater bash -c "./RuntimeModelUpdater"

6. STEP-BY-STEP OFFLINE ANALYSIS GUIDELINE

The strace output data collected from Molpro was obtained for different system scenarios. The available dataset traces are fully described in (D4.2 Preliminary Offline Trace Analysis), chapter III. For this guideline, the

configuration for the two strace log files is separated by the storage type of the machines (HDD, SSD); the files are named strace.out_01 and strace.out_02 respectively.

a) CREATE HBASE SCHEMA TABLES

To create the tables, use the schemas provided in the 1_Hbase_schema folder. The execution is as follows:

Usage: sudo -u hbase hbase shell <localsrc>
Example: sudo -u hbase hbase shell /tmp/1_hbase_schema/ulm_strace_import.schema

The above also applies to the file ulm_strace_analysis_result.schema. The raw strace output files need to be imported into HBase tables (ulm_strace_import.schema script), and the results of the analysis scripts must be stored in different HBase tables (ulm_strace_analysis_result.schema).

b) IMPORTING STRACE DATA TO HBASE

The instructions for this step are the following:

i. The two files (strace.out_01, strace.out_02) first have to be uploaded to HDFS. To upload them, execute the command:

Usage: sudo -u hdfs hadoop fs -put <localsrc> <HDFS_dest_Path>
Example: sudo -u hdfs hadoop fs -put /tmp/strace.out_01 /tmp/

Execute the same command for the strace.out_02 log file. Once both files are uploaded to HDFS, the following instructions can be carried out. More information about the traces can be found in (D4.2 Preliminary Offline Trace Analysis), chapter IV.

ii. Edit the pig script storenewstracedata.pig (in the 2_Import_logs folder) by configuring the path of the myudf.jar provided in the same folder.
iii. Run the pig script storenewstracedata.pig with the following command line (change the versions according to your installation of the tools):

Usage: <exec file of pig> -Dpig.additional.jars=<path to hbase jar>:<path to zookeeper jar>:<path to pig jar>:<path to hbase-env.jar of chukwa> <localsrc of script>
Example: pig -Dpig.additional.jars=/usr/lib/hbase/hbase-1.1.1.jar:/usr/lib/zookeeper/zookeeper-3.4.6.jar:/var/pig/pig-0.12.1/pig-0.12.1.jar:/home/chukwa/hbase-env.jar /tmp/2_import_logs/storenewstracedata.pig

NOTE: Run the pig script storenewstracedata.pig twice, first for strace.out_01 and then for strace.out_02 (simply search for the LOAD command in the pig script and change the file name).

c) ANALYSING STRACE DATA STORED IN HBASE

With the data imported, the analysis scripts are executed on it to obtain meaningful results (more information in (D4.2 Preliminary Offline Trace Analysis), section V.1). Run the following commands (change the versions according to your installation of the tools) to execute the analytic pig scripts stracedataanalytic_perjob.pig, stracedataanalytic_timeseries.pig and stracedataanalytic_variance.pig from the 3_Analysis folder, each separately:

Usage: <exec file of pig> -Dpig.additional.jars=<path to hbase jar>:<path to zookeeper jar>:<path to pig jar>:<path to hbase-env.jar of chukwa> <localsrc of script>
Example: pig -Dpig.additional.jars=/usr/lib/hbase/hbase-1.1.1.jar:/usr/lib/zookeeper/zookeeper-3.4.6.jar:/var/pig/pig-0.12.1/pig-0.12.1.jar:/home/chukwa/hbase-env.jar /tmp/3_analysis/stracedataanalytic_perjob.pig

NOTE: Run each analytic script twice, first for strace.out_01 and then for strace.out_02 (search for the FILTER command in the pig script and change the file name; also search for the STORE command and change the HBase table name extension between _01 and _02).

ATTENTION: Every analytic pig script has a parameter named $START. It sets minkeyval, so that only rows with rowkeys greater than minkeyval are returned. Because we want all rows of the HBase table rather than a specific range, this parameter can be ignored by deleting the code part -gt $START.

d) CSV GENERATION FROM THE ANALYSIS RESULTS

The pig scripts in the 4_CSV_Generation folder create CSV files from the analytic results stored in HBase; R scripts are finally used to produce the result graphs from these CSV files (more information in (D4.2 Preliminary Offline Trace Analysis), section V.2).
Open the pig scripts in the 4_CSV_Generation folder and change the HBase table name extensions between _01 and _02 so that each script runs twice (once for strace_output_01 and once for strace_output_02). Also change, in each script, the location where the CSV file is saved and the name of the CSV file. Execute the following result scripts separately: meanvalueresultcsv.pig, perjobresultcsv.pig, sizedistributionresultcsv.pig, standarddeviationresultcsv.pig, timeseriesresultcsv.pig. Run each script as follows:

Usage: <exec file of pig> -Dpig.additional.jars=<path to hbase jar>:<path to zookeeper jar>:<path to pig jar>:<path to hbase-env.jar of chukwa> <localsrc of script>
Example: pig -Dpig.additional.jars=/usr/lib/hbase/hbase-1.1.1.jar:/usr/lib/zookeeper/zookeeper-3.4.6.jar:/var/pig/pig-0.12.1/pig-0.12.1.jar:/home/chukwa/hbase-env.jar /tmp/4_csv_generation/meanvalueresultcsv.pig
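Every pig invocation in this guideline shares the same long -Dpig.additional.jars argument. As a convenience sketch, the argument can be assembled once in the shell and reused; the jar paths below mirror the examples in the text and must be adjusted to your installation:

```shell
#!/usr/bin/env bash
# Join the additional jar paths with ':' into one variable and reuse
# it for every pig invocation. Paths are the example paths from this
# guide; adjust them to your installation.
JARS=(
  /usr/lib/hbase/hbase-1.1.1.jar
  /usr/lib/zookeeper/zookeeper-3.4.6.jar
  /var/pig/pig-0.12.1/pig-0.12.1.jar
  /home/chukwa/hbase-env.jar
)
ADDITIONAL_JARS=$(IFS=:; echo "${JARS[*]}")   # join array elements with ':'
echo "$ADDITIONAL_JARS"
# e.g.: pig -Dpig.additional.jars="$ADDITIONAL_JARS" /tmp/3_analysis/stracedataanalytic_perjob.pig
```

This avoids retyping the classpath for each of the import, analysis and CSV-generation scripts.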

e) TROUBLESHOOTING

A user might face several issues during the execution of the scripts, either environmental or caused by incorrect invocation. Possible issues and their solutions are given below. The provided scripts themselves have been tested and produce the expected results.

1. Problem: Error: JAVA_HOME is not set.
   Solution: Export the environment variable, e.g. export JAVA_HOME=/etc/alternatives/jre

2. Problem: Running an analysis script produces one of the error logs below:
   [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/filter/WritableByteArrayComparable
   or
   [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/mapreduce/TableOutputFormat
   Solution: Even though the HBase jar is declared as a parameter of the pig invocation, the system needs the environment variable HBASE_HOME, e.g. export HBASE_HOME=/usr/lib/hbase

3. Problem: pig: command not found
   Solution: Either give the actual path of the pig executable when running the scripts, or declare an environment variable, e.g. export PIG_CLASSPATH=/var/pig/pig-0.12.1/bin:$PIG_CLASSPATH

4. Problem: Running an analysis script produces the warning below:
   WARN snappy.LoadSnappy: Snappy native library not loaded
   Solution: This message appears if the shared library (.so) for snappy is not located in the hadoop native library path. If the libraries are installed in the correct location, the message should not appear. Try e.g. ln -sf /usr/lib64/libsnappy.so /usr/lib/hadoop/lib/native/Linux-amd64-64/
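Troubleshooting items 1-3 all come down to missing environment variables. A consolidated setup sketch, suitable for appending to ~/.bashrc, is shown below; all paths are the example paths from this guide and must be adjusted to your installation. Note that JAVA_HOME and HBASE_HOME each hold a single directory, so no ':' suffix is appended to them:

```shell
#!/usr/bin/env bash
# Consolidated environment setup covering troubleshooting items 1-3.
# All paths are the example paths from this guide; adjust as needed.
export JAVA_HOME=/etc/alternatives/jre                         # item 1
export HBASE_HOME=/usr/lib/hbase                               # item 2
export PIG_CLASSPATH=/var/pig/pig-0.12.1/bin:$PIG_CLASSPATH    # item 3
export PATH=/var/pig/pig-0.12.1/bin:$PATH                      # so 'pig' resolves on the command line
echo "$JAVA_HOME $HBASE_HOME"
```

After sourcing this (or opening a new shell), the pig scripts can be run without the per-command workarounds above.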