Deploying and Managing SolrCloud in the Cloud ApacheCon, April 8, 2014 Timothy Potter. Search Discover Analyze



Similar documents
HDFS Cluster Installation Automation for TupleWare

Building a Scalable News Feed Web Service in Clojure

THE EUCALYPTUS OPEN-SOURCE PRIVATE CLOUD

Automate Your BI Administration to Save Millions with Command Manager and System Manager

Tcl and Cloud Computing Automation

Leveraging the Power of SOLR with SPARK. Johannes Weigend QAware GmbH Germany pache Big Data Europe September 2015

Spectrum Scale HDFS Transparency Guide

Real-time Data Analytics mit Elasticsearch. Bernhard Pflugfelder inovex GmbH

ZooKeeper Administrator's Guide

Search and Real-Time Analytics on Big Data

Blackboard Open Source Monitoring

Last time. Today. IaaS Providers. Amazon Web Services, overview

Automated Application Provisioning for Cloud

Analyzing large flow data sets using. visualization tools. modern open-source data search and. FloCon Max Putas

Amazon Elastic Beanstalk

Jenkins: The Definitive Guide

RTI Quick Start Guide for JBoss Operations Network Users

Grinder in the Cloud. Get Loaded!

Jeffrey D. Ullman slides. MapReduce for data intensive computing

Backup and Recovery of SAP Systems on Windows / SQL Server

Technical Overview Simple, Scalable, Object Storage Software

Ansible in Depth WHITEPAPER. ansible.com

SUSE Manager in the Public Cloud. SUSE Manager Server in the Public Cloud

Cloud Computing. Adam Barker

INSTALLING KAAZING WEBSOCKET GATEWAY - HTML5 EDITION ON AN AMAZON EC2 CLOUD SERVER

Managing your Red Hat Enterprise Linux guests with RHN Satellite

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2

2) Xen Hypervisor 3) UEC

Getting Started with Amazon EC2 Management in Eclipse

Background on Elastic Compute Cloud (EC2) AMI s to choose from including servers hosted on different Linux distros

ur skills.com

This document will list the ManageEngine Applications Manager best practices

MANAGE YOUR AMAZON AWS ASSETS USING BOTO

WHITE PAPER Redefining Monitoring for Today s Modern IT Infrastructures

opennms reporting generation tool

GeoCloud Project Report GEOSS Clearinghouse

Oracle WebLogic Server 11g Administration

Monitis Project Proposals for AUA. September 2014, Yerevan, Armenia

Savanna Hadoop on. OpenStack. Savanna Technical Lead

Data Centers and Cloud Computing

HP OO 10.X - SiteScope Monitoring Templates

XpoLog Competitive Comparison Sheet

CHAPTER 1 - JAVA EE OVERVIEW FOR ADMINISTRATORS

ITG Software Engineering

Integration of IT-DB Monitoring tools into IT General Notification Infrastructure

Using SNMP to Obtain Port Counter Statistics During Live Migration of a Virtual Machine. Ronny L. Bull Project Writeup For: CS644 Clarkson University

Building a Private Cloud Cloud Infrastructure Using Opensource

McAfee Public Cloud Server Security Suite

Introduction to Openstack, an Open Cloud Computing Platform. Libre Software Meeting

Cloudera Manager Training: Hands-On Exercises

Eucalyptus User Console Guide

Vistara Lifecycle Management

Zabbix for Hybrid Cloud Management

Chapter 2 SYSTEM MANAGEMENT. SYS-ED/ Computer Education Techniques, Inc.

User Guide for VMware Adapter for SAP LVM VERSION 1.2

Deploying Hadoop with Manager

CONDOR CLUSTERS ON EC2

Red Hat enterprise virtualization 3.0 feature comparison

Improved metrics collection and correlation for the CERN cloud storage test framework

Comsol Multiphysics. Running COMSOL on the Amazon Cloud. VERSION 4.3a

Zend Server Amazon AMI Quick Start Guide

Bigdata High Availability (HA) Architecture

ArcGIS for Server in the Amazon Cloud. Michele Lundeen Esri

XpoLog Center Suite Data Sheet

TECHNOLOGY WHITE PAPER Jun 2012

ArcGIS for Server: Administrative Scripting and Automation

Cloud computing - Architecting in the cloud

Getting Started with Hadoop with Amazon s Elastic MapReduce

Log managing at PIC. A. Bruno Rodríguez Rodríguez. Port d informació científica Campus UAB, Bellaterra Barcelona. December 3, 2013

Cloud Computing for Control Systems CERN Openlab Summer Student Program 9/9/2011 ARSALAAN AHMED SHAIKH

AdWhirl Open Source Server Setup Instructions

WebLogic Server Administration

Installation of PHP, MariaDB, and Apache

Enterprise IT is complex. Today, IT infrastructure spans the physical, the virtual and applications, and crosses public, private and hybrid clouds.

<Insert Picture Here> Managing WebLogic Server Lifecycle

Log management with Logstash and Elasticsearch. Matteo Dessalvi

FleSSR Project: Installing Eucalyptus Open Source Cloud Solution at Oxford e- Research Centre

Comsol Multiphysics. Running COMSOL on the Amazon Cloud. VERSION 4.3b

Assignment # 1 (Cloud Computing Security)

Using SUSE Studio to Build and Deploy Applications on Amazon EC2. Guide. Solution Guide Cloud Computing.

Virtualization Technologies and Blackboard: The Future of Blackboard Software on Multi-Core Technologies

Using The Hortonworks Virtual Sandbox

13.1 Backup virtual machines running on VMware ESXi / ESX Server

Technical Report. Implementation and Performance Testing of Business Rules Evaluation Systems in a Computing Grid. Brian Fletcher x

Drupal in the Cloud. Scaling with Drupal and Amazon Web Services. Northern Virginia Drupal Meetup

Big data blue print for cloud architecture

Department of Veterans Affairs VistA Integration Adapter Release Enhancement Manual

Integrating VoltDB with Hadoop

Amazon Web Services Primer. William Strickland COP 6938 Fall 2012 University of Central Florida

IBM Tivoli Composite Application Manager for Microsoft Applications: Microsoft Hyper-V Server Agent Version Fix Pack 2.

Hadoop Setup. 1 Cluster

Cloud Computing Now and the Future Development of the IaaS

CI Pipeline with Docker

Transcription:

Deploying and Managing SolrCloud in the Cloud ApacheCon, April 8, 2014 Timothy Potter Search Discover Analyze

My SolrCloud Experience Currently, working on scaling up to a 200+ node deployment at LucidWorks Operated 36 node cluster in AWS for Dachis Group (1.5 years ago, 18 shards ~900M docs) Contributed several tests and patches to the code base Built a Fabric/boto framework for deploying and managing a cluster in EC2 Co-author of Solr In Action; wrote CH 13 which covers SolrCloud

Solr Scaling Toolkit Requirements High-level overview Nuts and Bolts (live demo) Roadmap Q&A

Tasks to Automate Provisioning N machine instances in EC2 Configuring / starting ZooKeeper (1 to n servers) Configuring / starting N Solr instances in cloud mode (M x N nodes) Integrating with Logstash4Solr and other supporting services, e.g. collectd Day-to-day operations on an existing cluster

Python-based Tools boto Python API for AWS (EC2, S3, etc) Fabric Python-based tool for automating system admin tasks over SSH pysolr Python library for Solr (sending commits, queries, ) kazoo Python client tools for ZooKeeper Supporting Cast: JMeter run tests, generate reports collectd system monitoring Logstash4Solr log aggregation JConsole/VisualVM monitor JVM during indexing / queries

Fabric in 3 minutes or Less Fabric helps you do common system administration tasks on multiple hosts over SSH Just Python Easy to install / learn; good documentation http://docs.fabfile.org/en/1.8/ def kill(cluster):! ec2 = _connect_ec2()! taggedinstances = _find_instances_in_cluster(ec2, cluster)! instance_ids = taggedinstances.keys()! if confirm(('found %d instances to terminate, continue? '! % len(instance_ids))):! ec2.terminate_instances(instance_ids)! ec2.close()!

Fabric in 3 minutes or Less, cont. Define all commands in a file named: fabfile.py! Get a list of supported commands with short description $ fab -l! Available commands:! backup_to_s3 Backup an existing collection to S3! check_zk Performs health check against all! commit Sends a hard commit to the!! Get extended documentation for a command! $ fab -d new_solr_cloud! Displaying detailed information for task 'new_solrcloud :! Provisions n EC2 instances and then deploys SolrCloud; uses! the new_ec2_instances and setup_solrcloud commands!

Solr Scale Toolkit: Architecture Meta Node Solr-Scale-Toolkit deploy and manage SolrCloud cluster SolrCloud Nodes (NxM nodes) SiLK ZooKeeper Ensemble system monitoring of M machines w/ collectd and JMX Node 1: Custom AMI Solr Node 1: 8983 core core M of these machines ZK Host 1 ZK Host N Solr Node N: 898x ZooKeeper-1 ZooKeeper-N core core

Provisioning machines fab new_ec2_instances:test1,n=3,instance_type=m3.xlarge Custom built AMI? Block device mapping dedicated disk per Solr node Launch and then poll status until they are live verify SSH connectivity Tag each instance with a cluster ID and username

ZooKeeper fab new_zk_ensemble:zk1,n=3 Two options: provision 1 to N nodes when you launch Solr cluster use existing named ensemble Fabric command simply creates the myid files and zoo.cfg file for the ensemble and some cron scripts for managing snapshots Basic health checking of ZooKeeper status: echo srvr nc localhost 2181

SolrCloud Cluster: NxM nodes EC2 Instance: RedHat Enterprise Linux, 64-bit Solr 4.7.1 Node 1 collection1 shard1 / replica1 (Solr core) Limit to 50-100M docs across all cores per node collection2 shard2 / replica1 (Solr core) Must design to give bulk of the memory to OS cache MMapDirectory OS cache memory mapped I/O dedicated disk 1 x M instances Solr 4.7.1 Node N collection3 shard1 / replica1 (Solr core) collection1 shard2 / replica1 (Solr core) MMapDirectory dedicated disk N

SolrCloud fab new_solrcloud:test1,zk=zk1,nodesperhost=2 Upload a BASH script that starts/stops Solr Set system props: jetty.port, host, zkhost, JVM opts One or more Solr nodes per machine JVM mem opts dependent on instance type and # of Solr nodes per instance Optionally configure log4j.properties to append messages to Rabbitmq for Logstash4Solr integration

solr-ctl.sh BASH script that implements: start/stop Solr nodes on each EC2 instance sets JVM memory options, system properties (jetty.port), enable remote JMX, etc backup log files before restarting nodes ensure JVM is killed correctly before restarting Environment variables in: solr-ctl-env.sh

Miscellaneous Utility Tasks Deploy a configuration directory to ZooKeeper Create a new collection Attach a local JConsole/VisualVM to a remote JVM Rolling restart (with Overseer awareness) Build Solr locally and patch remote Use a relay server to scp the JARs to Amazon network once and then scp them to other nodes from within the network Put/get files Grep over all log files (across the cluster)

Other useful stuff fab mine: See clusters I m running (or for other users too) fab kill_mine: Terminate all instances I m running Use termination protection in production fab ssh_to: Quick way to SSH to one of the nodes in a cluster fab stop/recover/kill: Basic commands for controlling specific Solr nodes in the cluster fab jmeter: Execute a JMeter test plan against your cluster Example test plan and Java sampler is included with the source

SolrCloud Tools (SolrJ client app)./tools.sh tool healthcheck Java-based command-line application that uses SolrJ s CloudSolrServer to perform advanced cluster management operations: healthcheck: collect metadata and health information from all replicas for a collection from ZooKeeper backup: create a snapshot of each shard in a collection for backing up to remote storage (S3) Framework for building complex tools that benefit from having access to cluster state information in ZooKeeper

SiLK Integra6on SiLK: Solr integrated with Logstash and Kibana Index time-series data, such as log data (collectd, Solr logs, ) Build cool dashboards with Banana (fork of Kibana) Easily aggregate all WARN and more severe log messages from all Solr servers into logstash4solr Send collectd metrics to logstash4solr

SiLK Integra6on Solr Node 1: 8983 core many of these core Solr Node N: 8983 core core AMQP Log4J Appender MQ decouple log write performance from log indexing logstash4solr parsing/ indexing logstash4solr index banana Dashboard Ad hoc log analysis Log Records Include: - host:port - collection - shard - test label + standard Log4J message fields

What s Next? Migrate to using Apache libcloud instead of using boto directly Use this framework to perform large-scale performance testing Report results back to community Ability to request spot instances Good for testing only Chaos monkey tests integrate jepsen? Open source so please kick the tires!

Wrap- up LucidWorks: http://www.lucidworks.com SiLK: http://www.lucidworks.com/lucidworks-silk/ Solr In Action: http://www.manning.com/grainger/ Connect: @thelabdude / tim.potter@lucidworks.com Questions?