CONSUL AS A MONITORING SERVICE
SETH VARGO @sethvargo
SERVICE ORIENTED ARCHITECTURE
SOA PRIMER Autonomous, Limited Scope, Loose Coupling
ORDER PROCESSING ORDER WEB APP HISTORY FORECASTING
ORDER PROCESSING WEB APP DISCOVERY Which nodes are part of "order processing"?
ORDER PROCESSING NODE 1 WEB APP NODE 2 NODE N LOAD BALANCING How to ensure request leveling across providers?
ORDER PROCESSING NODE 1 WEB APP NODE 2 LOAD BALANCER NODE N ANTI-PATTERN Load Balancer is a Single Point of Failure (SPOF)
ORDER PROCESSING NODE 1 WEB APP NODE 2 LOAD BALANCER NODE 3 HEALTH CHECKING How to avoid routing to unhealthy hosts?
maintenance: false feature_a: true role: "web" WEB 1 WEB APP WEB 2 WEB N CONFIGURATION How to efficiently push dynamic configuration?
4 BASIC PROBLEMS: SERVICE DISCOVERY, LOAD BALANCING, HEALTH CHECKING, KEY-VALUE CONFIGURATION
EXISTING "SOLUTIONS" ZOOKEEPER ETCD SENSU SMART STACK http://consul.io/intro/vs
CONSUL
Service Discovery HTTP + DNS
demo master dig web-frontend.service.consul
; <<>> DiG 9.8.3-P1 <<>> web-frontend.service.consul. ANY
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29981
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;web-frontend.service.consul. IN ANY
;; ANSWER SECTION:
web-frontend.service.consul. 0 IN A 10.0.3.83
web-frontend.service.consul. 0 IN A 10.0.1.109
Datacenter Aware
CLIENT CLIENT CLIENT CLIENT CLIENT CLIENT RPC LAN GOSSIP RPC SERVER SERVER SERVER REPLICATION REPLICATION WAN GOSSIP SERVER REPLICATION SERVER REPLICATION SERVER
Host & Service Level Health Checks
demo master consul-template -template="example.ctmpl" -dry
>
listen http-in
    bind *:8000
    server web-0 127.0.0.1:80
    server web-1 127.0.0.1:80
    server web-2 127.0.0.1:80

demo master sudo stop webserver
>
listen http-in
    bind *:8000
    server web-1 127.0.0.1:80
    server web-2 127.0.0.1:80

demo master sudo start webserver
>
listen http-in
    bind *:8000
    server web-0 127.0.0.1:80
    server web-1 127.0.0.1:80
    server web-2 127.0.0.1:80
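The contents of example.ctmpl are not shown in the demo, so the following is only a rough Python sketch of what the rendering step amounts to: turn the current list of healthy instances into a `listen` block. The function name `render_haproxy` and the instance tuples are hypothetical, and this is not how consul-template itself is implemented.

```python
def render_haproxy(servers, bind="*:8000"):
    """Render a 'listen' block like the demo output from (name, address, port) tuples."""
    lines = ["listen http-in", f"    bind {bind}"]
    for name, address, port in servers:
        lines.append(f"    server {name} {address}:{port}")
    return "\n".join(lines)


# Hypothetical instance list mirroring the three servers in the demo output.
healthy = [
    ("web-0", "127.0.0.1", 80),
    ("web-1", "127.0.0.1", 80),
    ("web-2", "127.0.0.1", 80),
]

print(render_haproxy(healthy))
# Once web-0 stops passing its check it is no longer in the list,
# so the re-rendered block simply omits it.
print(render_haproxy(healthy[1:]))
```

The point of the design is that the load balancer configuration is derived from health state rather than edited by hand.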
K/V Store HTTP API
demo master curl -X PUT -d 'bar' http://localhost:8500/v1/kv/foo
true
demo master curl http://localhost:8500/v1/kv/foo
[
  {
    "CreateIndex": 100,
    "ModifyIndex": 200,
    "Key": "foo",
    "Flags": 0,
    "Value": "YmFy"
  }
]
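The "Value" field in the response is base64-encoded, which is why "bar" comes back as "YmFy". A minimal Python sketch of decoding such a response follows; the helper name `decode_kv` is hypothetical and the response body is copied from the slide rather than fetched from a live agent.

```python
import base64
import json

# Response body as shown on the slide (not fetched from a running agent).
body = (
    '[{"CreateIndex": 100, "ModifyIndex": 200, '
    '"Key": "foo", "Flags": 0, "Value": "YmFy"}]'
)


def decode_kv(response_text):
    """Map each key in a KV response to its base64-decoded string value."""
    entries = json.loads(response_text)
    return {
        entry["Key"]: base64.b64decode(entry["Value"] or "").decode("utf-8")
        for entry in entries
    }


print(decode_kv(body))
```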
TRUSTED BY
HEALTH CHECKS
WHAT IS A CHECK? Any command that returns an exit code: 0 = PASSING, 1 = WARNING, any other code = FAILING
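The exit-code convention can be sketched in a few lines of Python. The helper names `status_from_exit_code` and `run_check` are hypothetical, and this illustrates the convention only, not Consul's own check runner. The slide's label "failing" is used here; Consul itself reports that state as "critical".

```python
import subprocess
import sys


def status_from_exit_code(code):
    """Translate a check command's exit code into a health status."""
    if code == 0:
        return "passing"
    if code == 1:
        return "warning"
    return "failing"  # any other exit code


def run_check(command):
    """Run a command and return (status, captured output)."""
    result = subprocess.run(command, capture_output=True, text=True)
    output = (result.stdout + result.stderr).strip()
    return status_from_exit_code(result.returncode), output


# Use the current Python interpreter as a stand-in check command.
print(run_check([sys.executable, "-c", "print('ok')"]))
print(run_check([sys.executable, "-c", "import sys; sys.exit(1)"]))
print(run_check([sys.executable, "-c", "import sys; sys.exit(2)"]))
```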
WHAT IS A CHECK? Output is captured as a "note" for inspection
$ curl http://127.0.0.1:4455/_health
curl: (7) Failed to connect to 127.0.0.1 port 4455: Connection refused
CREATING A CHECK Use a custom script
{
  "check": {
    "id": "mem-util",
    "name": "Memory utilization",
    "script": "/usr/local/bin/check_mem.py",
    "interval": "10s"
  }
}
CREATING A CHECK Use a built-in check type
{
  "check": {
    "id": "api",
    "name": "HTTP API on port 4455",
    "http": "http://localhost:4455/_health",
    "interval": "10s",
    "timeout": "1s"
  }
}
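The slides do not show check_mem.py itself. As an illustration only, a custom memory check could look roughly like the sketch below: it reads Linux-style /proc/meminfo text and maps the used percentage to exit code 0, 1, or 2. The thresholds (80% and 90%), the function names, and the sample input are all assumptions.

```python
WARN_PERCENT = 80.0  # assumed warning threshold
FAIL_PERCENT = 90.0  # assumed failure threshold


def used_percent(meminfo_text):
    """Compute used memory as a percentage from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])  # first token is the number
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return 100.0 * (total - available) / total


def exit_code_for(percent):
    """0 = passing, 1 = warning, 2 = failing, following the check convention."""
    if percent >= FAIL_PERCENT:
        return 2
    if percent >= WARN_PERCENT:
        return 1
    return 0


# Made-up sample input; a real script would read /proc/meminfo instead
# and finish with sys.exit(exit_code_for(percent)).
sample = "MemTotal:       1000 kB\nMemAvailable:    150 kB\n"
percent = used_percent(sample)
print(f"memory used: {percent:.1f}%")  # printed output becomes the check's note
print(exit_code_for(percent))
```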
RESPONSIVE
TRADITIONAL MONITORING Pushes information into a silo [diagram: WEB 1, WEB 2, WEB N, MONITORING SERVICE]
CONSUL MONITORING Removes unhealthy nodes from service discovery layer [diagram: WEB 1, WEB 2, WEB N, CONSUL]
dig web.service.consul 10.0.1.4 10.0.1.5 10.0.1.6
dig web.service.consul 10.0.1.5 10.0.1.6 (request with host: web.service.consul)
dig web.service.consul 10.0.1.4 10.0.1.5 10.0.1.6
CONSUL MONITORING Unhealthy nodes are not returned from DNS queries dig web.service.consul web-01, web-02, web-03
CONSUL MONITORING Unhealthy nodes are not returned from HTTP API curl /v1/health/service/web?passing web-01, web-02, web-03
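Consul's health endpoint (`/v1/health/service/<name>`) returns one entry per service instance, each with its node and a list of checks carrying a "Status" field. A client-side filter over that shape can be sketched as below. The sample entries are made up for illustration (the pairing of the slide's node names with addresses is an assumption), and `passing_nodes` is a hypothetical helper.

```python
def passing_nodes(health_entries):
    """Return the names of nodes whose checks are all passing."""
    names = []
    for entry in health_entries:
        if all(check["Status"] == "passing" for check in entry["Checks"]):
            names.append(entry["Node"]["Node"])
    return names


# Made-up sample shaped like a health endpoint response; not real output.
sample = [
    {"Node": {"Node": "web-01", "Address": "10.0.1.4"},
     "Checks": [{"Status": "passing"}, {"Status": "critical"}]},
    {"Node": {"Node": "web-02", "Address": "10.0.1.5"},
     "Checks": [{"Status": "passing"}]},
    {"Node": {"Node": "web-03", "Address": "10.0.1.6"},
     "Checks": [{"Status": "passing"}, {"Status": "passing"}]},
]

print(passing_nodes(sample))
```

A single non-passing check is enough to drop a node in this sketch, which matches the idea that callers should never be routed to an unhealthy host.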
LOCKING
CONSUL LOCK Allows for a new kind of "HA" demo master consul lock [options] prefix child...
CONSUL LOCK Making standby HA much simpler [diagram: VAULT 1, VAULT 2, VAULT 3, CONSUL]
LEADER ELECTION: VAULT 1 holds the lock (L)
REQUEST GET /secret/foo while VAULT 1 holds the lock
VAULT 1 loses the lock; VAULT 3 holds the lock (L)
REQUEST GET /secret/foo while VAULT 3 holds the lock
CONSUL LOCK Solves the "exactly one of these must always be running" problem
CONSUL LOCK Also great as a semaphore - rolling restarts
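The lock and semaphore behaviour described on these slides can be illustrated with a small in-memory stand-in. This sketch does not talk to Consul and does not reflect how `consul lock` is implemented; the class `InMemorySemaphore` and the holder names are hypothetical. With a limit of 1 it behaves like a lock ("exactly one must be running"); with a higher limit it behaves like a semaphore that caps how many nodes restart at once.

```python
class InMemorySemaphore:
    """Toy stand-in for a shared lock: at most `limit` holders at a time."""

    def __init__(self, limit=1):
        self.limit = limit
        self.holders = set()

    def acquire(self, name):
        """Return True if `name` holds a slot after this call."""
        if name in self.holders:
            return True
        if len(self.holders) < self.limit:
            self.holders.add(name)
            return True
        return False

    def release(self, name):
        """Give up the slot, for example when the holder stops or fails."""
        self.holders.discard(name)


# Lock (limit=1): only one Vault instance is active at a time.
lock = InMemorySemaphore(limit=1)
print(lock.acquire("vault-1"))  # vault-1 becomes the active instance
print(lock.acquire("vault-3"))  # vault-3 stays on standby
lock.release("vault-1")         # vault-1 goes away
print(lock.acquire("vault-3"))  # vault-3 takes over

# Semaphore (limit=2): at most two web nodes restart concurrently.
restarts = InMemorySemaphore(limit=2)
print([restarts.acquire(node) for node in ("web-1", "web-2", "web-3")])
```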
SCALABILITY
TRADITIONAL MONITORING Notifies/polls all statuses [diagram: WEB 1 "I'm healthy", WEB 2 "Good, thanks for asking!", WEB N, MONITORING SERVICE]
TRADITIONAL MONITORING Notifies/polls all statuses [diagram: WEB 1, WEB 2, WEB 1,000, HA MONITORING SERVICE, 1,000'S OF REQUESTS]
CONSUL MONITORING Notifies on status changes [diagram: WEB 1 "My status has changed", WEB 2, WEB N, CONSUL]
CONSUL MONITORING Notifies on status changes [diagram: WEB 1, WEB 2, WEB 1,000, CONSUL, 10'S OF REQUESTS]
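The difference between reporting every status and reporting only changes can be made concrete with a toy count. This is a counting exercise under made-up numbers (1,000 nodes, three rounds, one node that fails and recovers), not a model of Consul's gossip protocol; both function names are hypothetical.

```python
def polled_messages(rounds):
    """Every node reports its status in every round."""
    return sum(len(statuses) for statuses in rounds)


def change_messages(rounds):
    """Only status transitions between consecutive rounds are reported."""
    changes = 0
    for previous, current in zip(rounds, rounds[1:]):
        changes += sum(
            1 for node, status in current.items() if previous.get(node) != status
        )
    return changes


# Made-up scenario: web-7 fails in the second round and recovers in the third.
baseline = {f"web-{i}": "passing" for i in range(1, 1001)}
degraded = dict(baseline)
degraded["web-7"] = "critical"
rounds = [baseline, degraded, baseline]

print(polled_messages(rounds))
print(change_messages(rounds))
```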
CONCLUSION
SOLVES 4 BASIC PROBLEMS: SERVICE DISCOVERY, LOAD BALANCING, HEALTH CHECKING, KEY-VALUE CONFIGURATION
SOLVES 4 MORE PROBLEMS: LEADER ELECTION, SEMAPHORE LOCKING, RESPONSIVE, SCALABLE
SETH VARGO @sethvargo QUESTIONS?