Real-life experience implementing monitoring & services management. Andreas Tsangaris, CTO, PERFORMANCE
The reality of today's IT environment. No central view into what's really going on in IT. Experts have their own tools. Help desk has limited visibility into IT issues? IT cannot explain outages and is not reacting fast enough for the business? IT is challenged with: multiple event consoles presenting different domain-level information; finger pointing when issues happen; too many duplicate events that distract from the real issue; rising cost of incidents.
OTE Monitoring Project. Strategy: Create a single platform to consolidate all monitoring mechanisms, targeted to deliver Business Service monitoring. Monitoring Platform: HP Operations Manager (OMi). Components chosen: BSM, TBEC, OML (existing), SiteScope, BPM, SCOM (existing), Nagios (existing). Started: May 2012. Completed: November 2012.
Centralized, consolidated view into IT issues. OMi: the solution to solve today's complex environments (applications, infrastructure). OMi as the operations bridge*: single place to consolidate events from various sources; events correlated and root cause determined; automatically mapped to CIs representing services/apps; automatically prioritized based on business impact; enriched with extra state information and expert advice; automatically fixed or assigned to expert groups; experts triage and analyze remaining problems using context-sensitive tools/performance graphing; built-in collaboration capabilities for fast turnaround; dashboards to provide summaries at a glance; remediate via links to Remedy and instructions. *ITIL V3 definition, Operations Bridge: A physical location where IT Services and IT Infrastructure are monitored and managed.
A universal event correlation engine. OMi obtains data from various event/data collectors and relates it automatically to a complete model of the IT infrastructure. Diagram labels: BMC Remedy; OMi; BSM platform; run-time service model; BSM connectors for SCOM, IBM, Nagios; open BSM connector interfaces; HP connectors; NNMi; Microsoft SCOM; NAGIOS; 3rd-party domain managers; APM; SiteScope; OM/PM. Complete cross-domain visibility of IT infrastructure issues.
Monitoring Solution Architecture
Integration of HP BSM with HP OM. Information/metrics/health indicators are sent from HP OM to the HP BSM Platform, and events are returned to HP OM in case of automatic or ad-hoc actions from the OMi Console within BSM.
Integration of HP BSM with HP SiteScope. Integration of HP SiteScope, HP OM, HP BSM, HP Performance Manager and HP Diagnostics.
Agentless / Agents
Single consolidated event dashboard. Pure event-driven status to complement BSM service health. Fast time-to-value; works purely on events; no need to map events to health indicators; simple operations; integrated into BSM mash-up; drill-down to Event Console; simple status aggregation via filters and views; modern user interface; optimized usage of available real estate; custom pictures for quick identification; flexible layout capabilities.
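To illustrate what "simple status aggregation via filters" can mean in practice, here is a minimal sketch, not the OMi implementation. It assumes a hypothetical event shape (a dict with `category` and `severity`) and a hypothetical severity ordering; a dashboard tile is modeled as a filter whose status is the worst severity among the matching events.

```python
# Hypothetical severity scale, ordered from best to worst.
SEVERITY_ORDER = ["normal", "warning", "minor", "major", "critical"]


def aggregate_status(events, event_filter):
    """Return the worst severity among the events accepted by event_filter.

    events: list of dicts with at least a "severity" key (assumed shape).
    event_filter: predicate that selects the events belonging to one tile.
    """
    matching = [e for e in events if event_filter(e)]
    if not matching:
        return "normal"  # no open events means the tile is healthy
    return max((e["severity"] for e in matching), key=SEVERITY_ORDER.index)


events = [
    {"category": "database", "severity": "minor"},
    {"category": "database", "severity": "critical"},
    {"category": "network", "severity": "warning"},
]
database_status = aggregate_status(events, lambda e: e["category"] == "database")
network_status = aggregate_status(events, lambda e: e["category"] == "network")
storage_status = aggregate_status(events, lambda e: e["category"] == "storage")
```

No mapping from events to health indicators is needed here: the status is derived from the events alone.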
Customizable operator console
Simple, fast, automated event management. Several correlation innovations take the work out of event handling: spend less time handling events and more time innovating and fixing issues. Topology-based event correlation (TBEC); stream-based event correlation; event suppression; automatic event storm detection; automatic duplicate suppression; automatic closing of related events based on key patterns and health indicators.
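Two of the mechanisms listed above, duplicate suppression and event storm detection, can be sketched in a few lines. This is an illustrative sketch only, not how OMi implements them: the `Event` fields, the correlation key format, the 60-second window and the threshold of 5 are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    key: str          # hypothetical correlation key, e.g. "node:metric"
    node: str
    timestamp: float  # seconds
    count: int = 1    # number of occurrences folded into this event


def suppress_duplicates(events):
    """Fold events sharing a key into the first occurrence and count them."""
    seen = {}
    for ev in events:
        if ev.key in seen:
            seen[ev.key].count += 1
        else:
            seen[ev.key] = Event(ev.key, ev.node, ev.timestamp)
    return list(seen.values())


def detect_storm(events, node, window=60.0, threshold=5):
    """Return True if node emits at least threshold events within window seconds."""
    recent = deque()
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.node != node:
            continue
        recent.append(ev.timestamp)
        # Drop timestamps that fell out of the sliding window.
        while ev.timestamp - recent[0] > window:
            recent.popleft()
        if len(recent) >= threshold:
            return True
    return False


events = [Event("db1:cpu", "db1", float(t)) for t in range(6)]
events.append(Event("web1:disk", "web1", 10.0))
unique = suppress_duplicates(events)
```

With this input the operator sees two events instead of seven, and `db1` is flagged as an event storm source.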
Unique topology-based event correlation (TBEC). Efficiency gains with advanced event causal correlation. Legend: cause; cause and symptom; symptom.
Use case addressed by TBEC:
1. Something goes wrong in your environment
2. Monitoring reports multiple problems via events
3. Usually just one of the events describes the cause of the problem
4. Others are just symptoms
5. Fix the cause and the symptoms go away
TBEC dynamically adapts based on discovered data from RTSM.
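The cause/symptom idea can be sketched with a plain dependency graph. This is a simplified illustration under stated assumptions, not the TBEC rule engine: the topology dictionary and CI names are hypothetical, and a failing CI is treated as a symptom whenever something it depends on (directly or transitively) is also failing. The "cause and symptom" case from the legend is not modeled.

```python
def classify_events(depends_on, failing):
    """Split failing CIs into causes and symptoms.

    depends_on: dict mapping a CI to the CIs it depends on (hypothetical topology).
    failing: set of CIs that currently have an open event.
    """
    def has_failing_dependency(ci, visited):
        for dep in depends_on.get(ci, ()):
            if dep in visited:
                continue  # guard against cycles in the topology
            visited.add(dep)
            if dep in failing or has_failing_dependency(dep, visited):
                return True
        return False

    causes, symptoms = set(), set()
    for ci in failing:
        if has_failing_dependency(ci, {ci}):
            symptoms.add(ci)
        else:
            causes.add(ci)
    return causes, symptoms


topology = {
    "web_shop": ["app_server"],
    "app_server": ["database"],
    "database": ["disk_array"],
}
causes, symptoms = classify_events(topology, {"web_shop", "app_server", "disk_array"})
```

Because the classification reads the topology at call time, a change in the discovered model changes the result without touching the logic, which mirrors the "dynamically adapts" point above.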
Single view for faster investigation of performance issues. Additional cross-BSM domain performance data. OOTB graphs based on CI type form a holistic view. Metrics from: OM, PA, SiteScope, SPIs, RUM, BPM, Diagnostics.
Business Service Management and HP OMi. Achieve operational excellence in a dynamic world. Application performance management: end-user experience, transactions, application diagnostics. Run-time service model: comprehensive, automated and up-to-date model for dynamic services. Infrastructure management: server, network, virtualization, 3rd-party monitoring tools. Universal event correlation and service intelligence. Reduce MTTR; reduce OpEx; improve health of business services.
Benefits. Reduced OpEx, faster MTTR, operate in heterogeneous environments. 24x7 visibility into business service availability; simplified single console to consolidate all events; universal event correlation based on unique innovations; automatically prioritize based on business impact; enrich with extra state information and expert advice; automatically fix or assign to expert; built-in collaboration for fast turnaround; remediate via links to service desk tools and runbooks; easily launch into context-sensitive tools/performance graphs for quick detailed analysis; automated event management, GUI, storm detection, advanced topology-based event correlation and stream-based event correlation.
Managed servers and policies
Managed Servers / OS (136 servers in total): AIX 10; HPUX 27; LINUX 61; SOLARIS 6. Virtual Nodes: 38.
Policies per Type (326 policies in total): Database 13; AIX O/S 23; Solaris O/S 22; Linux O/S 24; HPUX O/S 38; SNMP 4; JBOSS 10; SIEBEL SPI 55; SIEBEL Custom 7; Weblogic SPI 77; Custom 53.
SiteScope Monitors
Monitor Type                  Number
CPU Utilization               46
Oracle Health Metrics         27
DB Tablespace Checks          43
DB Health Checks              43
DB Custom Checks              257
DB Links Health Checks        30
Disk Space                    64
IIS Server Health             12
Log File                      20
Memory Utilization            43
Ping                          44
Port                          20
Unix Processes                29
URL Monitors                  299
Web Services (URL Content)    93
Windows Processes             18
Total                         1088
[Chart: Monitors per Type]
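The per-type shares shown in the "Monitors per Type" chart can be recomputed from the counts on this slide. The snippet below is just that arithmetic; the variable names are illustrative.

```python
# Monitor counts as listed on the slide.
monitors = {
    "CPU Utilization": 46,
    "Oracle Health Metrics": 27,
    "DB Tablespace Checks": 43,
    "DB Health Checks": 43,
    "DB Custom Checks": 257,
    "DB Links Health Checks": 30,
    "Disk Space": 64,
    "IIS Server Health": 12,
    "Log File": 20,
    "Memory Utilization": 43,
    "Ping": 44,
    "Port": 20,
    "Unix Processes": 29,
    "URL Monitors": 299,
    "Web Services (URL Content)": 93,
    "Windows Processes": 18,
}

total = sum(monitors.values())

# Share of each monitor type, rounded to whole percent.
shares = {name: round(100 * count / total) for name, count in monitors.items()}

# The two largest groups.
largest = sorted(monitors, key=monitors.get, reverse=True)[:2]
```

URL Monitors and DB Custom Checks together account for roughly half of all configured monitors.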
BPM Scenarios Configured (synthetic transactions for monitoring; availability and performance metrics for the list of applications below)
Applications Monitoring: 35 Applications
Service Flow Monitoring: 15 Service Flows
THANK YOU!