K1000: Advanced Topics
  Tyler Gingrich, Senior Engineering Manager, K1000
  Craig Thatcher, Software Engineer, K1000

Topics
  Konductor
  Scripting
  Managed Installs
  Munin
1/23/13

Konductor
  Background process on the K1000
  Job is to keep the K1000 busy without allowing it to be too busy
  What are tasks?
    Records in the KBSYS.KONDUCTOR_TASK table
    Inventory, scripting update, krash upload, scripts, patches, etc.
  What constitutes busy?
    Load average > load threshold
    Many Apache tasks launched
  How often is "periodically"?
    Every 1-30 seconds; varies according to the load
  How many tasks get launched?
    Varies according to the number of CPUs configured and the number of Apache tasks

Konductor Operations
  1. Wake up
  2. Monitor load level / Apache task count -> if busy, sleep (1-30 secs)
  3. Query all KONDUCTOR_TASKS for connected machines and return the results ordered by priority and overdueness. Return no more than the current TPL (tasks per loop) number of tasks to Konductor.
  4. Send AMP tasks to the specified agents, no more than one per agent
  5. Adjust sleep duration and sleep (1-30 secs)

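The task-selection part of this cycle can be sketched in Python. This is only an illustration, not the K1000's actual code: the Task fields, the select_tasks helper, and the assumption that a lower priority number means more urgent are all hypothetical. Only the ordering by priority and overdueness, the TPL cap, and the one-task-per-agent rule come from the slide.

```python
from dataclasses import dataclass

@dataclass
class Task:
    # Hypothetical stand-in for a row in KBSYS.KONDUCTOR_TASK.
    task_id: int
    agent: str          # machine the task targets
    priority: int       # assumption: lower number = more urgent
    overdue_secs: int   # how far past its due time the task is
    connected: bool     # whether the target machine is connected

def select_tasks(tasks, tpl):
    """Return the tasks to launch in one loop.

    Connected machines only, ordered by priority then overdueness,
    capped at `tpl`, and then limited to one task per agent.
    """
    candidates = [t for t in tasks if t.connected]
    candidates.sort(key=lambda t: (t.priority, -t.overdue_secs))
    limited = candidates[:tpl]          # no more than TPL tasks
    chosen, seen_agents = [], set()
    for t in limited:
        if t.agent in seen_agents:
            continue                    # one task per agent per loop
        seen_agents.add(t.agent)
        chosen.append(t)
    return chosen

tasks = [
    Task(1, "pc-a", 2, 30, True),
    Task(2, "pc-a", 1, 10, True),
    Task(3, "pc-b", 1, 50, True),
    Task(4, "pc-c", 1, 99, False),      # disconnected, never selected
    Task(5, "pc-d", 3, 5, True),
]
print([t.task_id for t in select_tasks(tasks, tpl=3)])  # [3, 2]
```

Applying the TPL cap before the per-agent filter mirrors the order of the steps on the slide; whether the appliance does it in that order is an assumption.
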
Konductor Load Balancing
  Balancing what, how, and why?
    CPU load: measure average load using the load_avg() system call
    Physical memory: measure the number of Apache instances as a proxy
  What settings can you change?
    Sets the CPU load level for preventing new task launches
    Is a multiplier in the TPL value
  Apache instance limit varies by machine: 250 for a physical K1200

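A minimal Python sketch of the two checks described in the Konductor slides: "busy" means the load average is above the load threshold or the Apache instance count has reached its limit. The function names and the one-second back-off step are assumptions made for illustration; the slides state only that the sleep stays within 1-30 seconds and varies with load.

```python
def is_busy(load_avg, load_threshold, apache_count, apache_limit):
    """Busy when load exceeds the threshold or Apache instances hit the limit."""
    return load_avg > load_threshold or apache_count >= apache_limit

def next_sleep(current_sleep, busy, min_sleep=1, max_sleep=30):
    """Hypothetical back-off rule: sleep longer when busy, shorter when idle.

    The result is always clamped to the 1-30 second range from the slides.
    """
    step = 1 if busy else -1
    return max(min_sleep, min(max_sleep, current_sleep + step))

# Values loosely modelled on the slides: threshold 5, Apache limit 250.
print(is_busy(1.8, 5, 120, 250))   # False
print(is_busy(6.2, 5, 120, 250))   # True
print(next_sleep(3, busy=True))    # 4
print(next_sleep(1, busy=False))   # 1
```

On a Unix-like system, Python's os.getloadavg() returns the 1, 5, and 15 minute load averages and could supply the load_avg argument.
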
Konductor Log
  How can I see what Konductor is up to?
    K1000->Settings->Logs->Konductor Log
  Useful columns
    lt: level threshold; affected by # of CPUs and the threshold setting
    lv: current load level; lv > lt = busy
    t: tasks; number of tasks launched since Konductor started
    sl: sleep; how many seconds Konductor sleeps each cycle
    tpl: tasks per loop; maximum number of tasks that can be launched
    apa: Apache instances; number of Apache processes currently running

  [2011-10-05 15:13:30-0400] stats [s:5463 t/s:5 t/tc:236 t:30947 tc:131 c:48446 cc:163 sl:1 sc:1319 tpl:250 lt:5 lv:1.8193]
  [2011-10-05 15:13:41-0400] stats [s:5474 t/s:5 t/tc:236 t:30947 tc:131 c:49082 cc:164 sl:3 sc:1320 tpl:250 lt:5 lv:1.8193]
  [2011-10-05 15:14:07-0400] stats [s:5500 t/s:5 t/tc:236 t:31168 tc:132 c:49802 cc:165 sl:2 sc:1323 tpl:250 lt:5 lv:1.8979]
  [2011-10-05 15:14:27-0400] stats [s:5520 t/s:5 t/tc:236 t:31394 tc:133 c:50539 cc:166 sl:1 sc:1325 tpl:250 lt:5 lv:1.7158]
  [2011-10-05 15:14:43-0400] stats [s:5536 t/s:5 t/tc:236 t:31624 tc:134 c:51221 cc:167 sl:1 sc:1326 tpl:250 lt:5 lv:1.5566]

  When should I worry?
    Extended period of time with high lv (above lt)
    Very, very high lv (> 100)
    Apache tasks pegged (around 100 or 150 depending on CPU count)

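To watch for the "lv above lt" condition without reading the log by eye, the stats lines can be parsed with a short script. The sketch below is in Python and assumes every stats line has the same bracketed key:value layout as the sample lines in this deck; parse_stats is a hypothetical helper, not a K1000 tool.

```python
import re

# Matches lines such as: [timestamp] stats [key:value key:value ...]
STATS_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] stats \[(?P<fields>[^\]]+)\]$")

def parse_stats(line):
    """Return (timestamp, {column: float}) or None if the line is not a stats line."""
    m = STATS_RE.match(line.strip())
    if m is None:
        return None
    fields = {}
    for pair in m.group("fields").split():
        key, _, value = pair.partition(":")
        fields[key] = float(value)
    return m.group("ts"), fields

line = ("[2011-10-05 15:13:30-0400] stats [s:5463 t/s:5 t/tc:236 t:30947 "
        "tc:131 c:48446 cc:163 sl:1 sc:1319 tpl:250 lt:5 lv:1.8193]")
ts, f = parse_stats(line)
busy = f["lv"] > f["lt"]            # the "lv > lt = busy" rule
print(ts, "lv:", f["lv"], "lt:", f["lt"], "busy:", busy)
```

For the sample line, lv (1.8193) is below lt (5), so the appliance is not considered busy.
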
Scripting
  K1000 scheduling mechanisms
    Internal (Inventory, Scripting Update, Krash Upload)
    Server (Online)
    Agent (Offline)
  Konductor is the primary load balancer
  KEY POINT: use ONLINE scripts if possible
    Can be load balanced
    Far less likely to overwhelm your K1

Scripting: Please Avoid or Take Care
  File uploads
  Be aware of disk space
  Executing as Offline script
  Calling of system scripts
  Modifying system scripts
  Deploy to All Machines (Offline)

Online vs. Offline Scripting
  Online scripting
    Initiated via Konductor using server scheduling
    Load balanced
    Run as capable
    Scripts executed as System by default (Windows)
    Alerting
  Offline scripting
    Initiated via agent scheduling
    Execute at bootup
    Execute at login
    Execute while disconnected
    Execute once after next check-in
    Not load balanced

Offline Scripts
  Frequency of Offline script updates is set via the Scripting Update Interval
  You can manually execute a scripting update
    Runkbot.exe 3 0
  Offline Scripting logs upload
  Assigned list of scripts updated to kbots.xml
  You can see results for a single machine in your browser
    /service/kbot_service_notsoap.php?METHOD=getkbotconfig&KUID=9D856089-ED8B-451C-A21A-A458AF457469

Offline Scripts
  /service/kbot_service_notsoap.php?METHOD=getkbotconfig&KUID=9D856089-ED8B-451C-A21A-A458AF457469
  The KUID is passed to get unique results for a single machine
  KUID is available from the computer inventory detail page

Offline Scripts
  Results are parsed and individual scripts are downloaded with dependencies
  /service/kbot_service_notsoap.php?METHOD=getkbot&KUID=9D856089-ED8B-451C-A21A-A458AF457469&KBOT_ID=100&KBOT_VERSION=1355180694r1

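The two request URLs shown in the Offline Scripts slides differ only in their query parameters, so they can be assembled with a small Python helper. The host name k1000.example.com and the function names are placeholders invented for this sketch; the path, METHOD values, and parameter names are taken from the slides.

```python
from urllib.parse import urlencode

SERVICE_PATH = "/service/kbot_service_notsoap.php"

def kbot_config_url(base, kuid):
    """URL that lists the offline scripts assigned to one machine."""
    query = urlencode({"METHOD": "getkbotconfig", "KUID": kuid})
    return f"{base}{SERVICE_PATH}?{query}"

def kbot_url(base, kuid, kbot_id, kbot_version):
    """URL that fetches one specific script for that machine."""
    query = urlencode({
        "METHOD": "getkbot",
        "KUID": kuid,
        "KBOT_ID": kbot_id,
        "KBOT_VERSION": kbot_version,
    })
    return f"{base}{SERVICE_PATH}?{query}"

base = "http://k1000.example.com"   # hypothetical appliance address
kuid = "9D856089-ED8B-451C-A21A-A458AF457469"
print(kbot_config_url(base, kuid))
print(kbot_url(base, kuid, 100, "1355180694r1"))
```

The KUID for a real machine comes from its computer inventory detail page, as noted in the slides.
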
Munin
  Performance monitoring graphs
  K1000 Settings->Logs->System Performance
  Data points every 5 minutes
  Most recent day on the left
  Most recent week on the right
  Learn what YOUR graphs look like
  Monitor for big or unusual changes

Call Support

Questions