Oracle and Streams Diagnostics and Monitoring


1 Oracle and Streams Diagnostics and Monitoring Eva Dafonte Pérez Florbela Tique Aires Viegas

2 Agenda Oracle Enterprise Manager Streams monitoring tool Local monitoring tools Network diagnostic tools Open questions

3 Oracle Enterprise Manager Overview: Set of centralized management tools: administration, configuration management, end-to-end monitoring, security capabilities. Proactive monitoring and alerting. Monitoring service performance and usage. Automation, scheduled jobs, patch management.

4 Oracle Enterprise Manager

5 Oracle Enterprise Manager

6 Oracle Enterprise Manager

7 Oracle Enterprise Manager Open questions: Thresholds configuration. Metrics for the server load. Run some advisors to try and pinpoint performance or configuration issues. Can Tier1 use CERN OEM to monitor their databases?

8 Streams monitoring tool Overview. Objectives: Replication topology. Status of streams connections. Error notifications. Monitor streams performance (latency, throughput, ...). Monitor resources related to the streams performance (Streams Pool memory, redo generation). Architecture: Strmmon daemon written in Python, collects streams and instances info + repository errors and warnings. End-user web application.
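
The two performance figures named on this slide, throughput and latency, can be derived from periodic polls of a replication process. The following is a minimal sketch in Python, not the Strmmon implementation: the `Sample` record, its field names, and both helper functions are hypothetical and only illustrate the arithmetic, assuming the monitor can read a cumulative LCR counter and the creation and handling times of the most recent LCR.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Sample:
    """One poll of a replication process (hypothetical record layout)."""
    taken_at: datetime          # when the monitor polled the database
    total_lcrs: int             # cumulative LCR counter at poll time
    last_lcr_created: datetime  # creation time of the newest LCR at the source
    last_lcr_handled: datetime  # time that same LCR was handled at this site


def throughput(prev: Sample, curr: Sample) -> float:
    """LCRs per second between two polls, from the counter difference."""
    elapsed = (curr.taken_at - prev.taken_at).total_seconds()
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (curr.total_lcrs - prev.total_lcrs) / elapsed


def latency_seconds(sample: Sample) -> float:
    """Delay between creation of the newest LCR and its handling."""
    return (sample.last_lcr_handled - sample.last_lcr_created).total_seconds()
```

A cumulative counter normally restarts from zero when the process restarts, so a real collector would have to treat a negative difference as a reset rather than as negative throughput; the sketch leaves that case out.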

9 Streams monitoring tool Monitor view Connection view

10 Streams monitoring tool Database list

11 Streams monitoring tool Connection dashboard view Detailed Streams view

12 Streams monitoring tool Graph generator

13 Streams monitoring tool New features: Error tab (web application): list of errors that have been reported by streams processes. Availability tab (web application): percentage availability of each instance, provided with availability plots. New metrics (monitor): CPU consumption, physical bytes read/written.
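
The percentage availability shown in the Availability tab can be illustrated with a short sketch. It assumes, hypothetically, that the monitor records one up/down result per instance at a regular polling interval; the slides do not say how the tool actually computes the figure, and the function name is illustrative.

```python
from typing import Sequence


def availability_percent(checks: Sequence[bool]) -> float:
    """Share of polls in which the instance answered, as a percentage.

    `checks` holds one boolean per poll (True = instance reachable).
    Equal spacing between polls is assumed, so each poll carries the
    same weight.
    """
    if not checks:
        raise ValueError("no checks recorded for this instance")
    return 100.0 * sum(checks) / len(checks)
```

For example, three successful polls out of four give an availability of 75 percent for that period.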

14 Streams monitoring tool Errors List CPU consumption

15 Streams monitoring tool Availability

16 Streams monitoring tool Open questions: Proposition of future features: Weekly reports (number of transactions applied, number of LCRs streamed, etc.)? More notifications via mail (high latency, high CPU utilization, etc.)? Some automation in streams administration? Detecting common failures (e.g. propagation hangs). Procedure to solve the failures. Streams errors report: Any action necessary at Tier1? Who is testing what? Alerts: RAL still does not receive notifications.
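
The proposed mail notifications for high latency or high CPU utilization amount to comparing collected metrics against configured limits. The following is a sketch of that check only: the metric names, the limit values, and the `breached` function are assumptions for illustration, and sending the mail itself is left out.

```python
from typing import Dict, List

# Hypothetical limits; real values would be agreed per site.
THRESHOLDS: Dict[str, float] = {
    "latency_s": 300.0,    # replication latency in seconds
    "cpu_percent": 90.0,   # CPU utilization of the instance
}


def breached(metrics: Dict[str, float],
             thresholds: Dict[str, float] = THRESHOLDS) -> List[str]:
    """Return one message per metric that exceeds its threshold.

    Metrics missing from `metrics` are skipped rather than reported,
    so a failed collection does not raise a false alert here.
    """
    messages = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            messages.append(f"{name}={value} exceeds threshold {limit}")
    return messages
```

The returned messages could form the body of a notification; an empty list means nothing needs to be sent.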

17 Local and Network monitoring: Triumf (slides), BNL (slides). Open questions: Is OEM sufficient? Which other tools? Which metrics should we pay attention to? Homemade tools for backup monitoring: RAL. Local monitoring with Nagios. Is this reasonable? Any experience?

18 Overall: What, specifically, should Tier-1s monitor on their own databases? What does CERN want to know about the sites? What do Tier-1 sites need to know about the CERN databases? What do Tier-1 sites need to know about other Tier-1 sites?