KnowOps: Towards an Embedded Knowledge Base for Management and Operations
Xu Chen*, Yun Mao*, Z. Morley Mao+, Kobus Van der Merwe*
* AT&T Labs Research
+ The University of Michigan, Ann Arbor
AT&T and the AT&T logo are trademarks of AT&T Knowledge Ventures.
What is Management?
- In short
  - Keep the network in a healthy state
  - Deliver SLA-compliant services for customers
- Not so short
  - Planned maintenance/upgrade
  - Fault management
  - Configuration management
  - Traffic/performance management
  - Security management
Page 2
Simplified View of Management Systems
[Figure: management system components: Ticketing System, Event Manager, Process Automation, Inventory, Event Correlation, Configuration Management, Instrumentation Interface]
Page 3
Simplified View of Management Systems
[Figure: Ticketing System, Process Automation, Event Correlation, Event Manager, Inventory, Configuration Management, and Instrumentation Interface, with a Knowledge Base]
Knowledge Base
- Sources: vendor configuration example, provider service design documents, operational experience/domain knowledge
- Captured in text-based documents
- Requires manual work to inject from documentation into management systems
- Difficult to keep in sync across systems
Page 4
KnowOps: Using a Shared and Machine-Readable Knowledge Base
[Figure: Ticketing System, Event Manager, Process Automation, Inventory, Event Correlation, Configuration Management, and Instrumentation Interface, with an Embedded Knowledge Base]
Page 5
COOLAID [CoNEXT 2010]
[Figure labels: Views, database, COOLAID, Rules, Vendors, Service Provider]
- Capture domain knowledge in a declarative language
  - Vendors: protocol mechanisms, dependencies
  - Service providers: service realizations, misconfigurations
- Automated reasoning mechanisms decoupled from the rules
  - Bottom-up reasoning
  - Top-down reasoning
Page 6
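To make the split between rules and reasoning concrete, here is a minimal Python sketch in which one rule set is evaluated by two independent procedures: bottom-up (forward chaining) and top-down (backward chaining). This is not COOLAID's language or engine. The `Rule` type, the fact names, and the three rules are hypothetical and chosen only for illustration.

```python
from __future__ import annotations
from typing import NamedTuple


class Rule(NamedTuple):
    head: str               # fact derived when every body fact holds
    body: tuple[str, ...]   # facts the head depends on


# Hypothetical rules: the fact names are illustrative, not taken from COOLAID.
RULES = [
    Rule("ospf_up", ("interface_up", "ospf_enabled")),  # vendor-style dependency
    Rule("lsp_up", ("ospf_up", "rsvp_enabled")),        # vendor-style dependency
    Rule("service_ok", ("lsp_up",)),                    # provider-style realization
]


def bottom_up(facts: set[str], rules: list[Rule]) -> set[str]:
    """Forward chaining: derive every fact that follows from the base facts."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.head not in derived and all(b in derived for b in rule.body):
                derived.add(rule.head)
                changed = True
    return derived


def top_down(goal: str, facts: set[str], rules: list[Rule]) -> bool:
    """Backward chaining: prove one goal by proving the body of a matching rule.

    Assumes the rule set is acyclic; cyclic rules would recurse forever.
    """
    if goal in facts:
        return True
    return any(
        all(top_down(b, facts, rules) for b in rule.body)
        for rule in rules
        if rule.head == goal
    )
```

Both functions take the rules as data, so a vendor or provider could change `RULES` without touching either reasoning procedure. For example, `top_down("service_ok", {"interface_up", "ospf_enabled", "rsvp_enabled"}, RULES)` asks a single question, while `bottom_up` with the same facts computes every derivable fact at once.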
Shared Knowledge Base in KnowOps
[Figure: Service Requirements, Vendor Rules, and Service Provider Rules form the shared knowledge base used by four components]
- Process Automation: What to do? What should be avoided?
- Event Correlation: What events should be correlated? What time windows should be used?
- Instrumentation: What to monitor? What to alarm?
- Configuration
Page 7
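As a rough illustration of one knowledge base serving several components, the Python sketch below keeps a single rule table. An event correlator reads it to decide which events to group and within what time window, and an instrumentation layer reads it to decide what to monitor. The `Event` type, the event kinds, and the 30-second window are assumptions made for this example; they do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str
    device: str
    timestamp: float  # seconds


# Hypothetical shared knowledge:
# root-cause kind -> (symptom kinds to fold in, time window in seconds)
CORRELATION_RULES = {
    "link_down": ({"ospf_adjacency_lost", "lsp_down"}, 30.0),
}


def correlate(events, rules):
    """Group each root-cause event with symptoms seen within its time window."""
    ordered = sorted(events, key=lambda e: e.timestamp)
    groups = {}
    for root in ordered:
        if root.kind not in rules:
            continue
        symptoms, window = rules[root.kind]
        groups[root] = [
            e for e in ordered
            if e.kind in symptoms and 0 <= e.timestamp - root.timestamp <= window
        ]
    return groups


def kinds_to_monitor(rules):
    """Instrumentation reads the same rules to decide which event kinds to collect."""
    kinds = set(rules)
    for symptoms, _window in rules.values():
        kinds |= symptoms
    return kinds
```

Because both functions read `CORRELATION_RULES`, adding a symptom kind in one place changes what is monitored and what is correlated together, which is the kind of consistency a shared knowledge base is meant to provide.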
Preliminary Experience
DROOLS: open-source business logic
- Rule engine
- Process automation
- Event correlation
- Optimization/Planning
Page 8
Example: VPLS
[Figure: Site1 and Site2 joined by a VPLS connection between two PE routers; labels: BGP session, LSP connection, two P routers, MPLS, RSVP, OSPF]
Page 9
Example: VPLS
[Figure labels: VPLS, LSP, BGP, MPLS, RSVP, OSPF, Interface Configuration]
Page 10
Rules for OSPF
- Dependency on interface configuration
[Figure: Rule 1 and Rule 2 listings]
Page 11
Planned Maintenance Example
- Automation process: pre-maintenance check, disable related alarms, shut down interface
- Knowledge sources: device vendor rules, service provider rules
- Example rule: disrupting VPLS service should raise a warning.
Page 12
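The planned-maintenance flow can be sketched in Python as a dependency walk followed by a rule check. The dependency edges in `DEPENDS_ON` are assumptions made for this example (the slides name the components but the exact edges are not recoverable), and the function names are hypothetical. Only the warning rule for VPLS comes from the slide.

```python
# Assumed dependency edges (dependent -> what it depends on); illustrative only.
DEPENDS_ON = {
    "VPLS": ["LSP", "BGP"],
    "LSP": ["MPLS", "RSVP"],
    "BGP": ["OSPF"],
    "MPLS": ["Interface"],
    "RSVP": ["OSPF"],
    "OSPF": ["Interface"],
}

# From the slide: disrupting VPLS service should raise a warning.
WARN_IF_DISRUPTED = {"VPLS"}


def affected_by(component, depends_on):
    """Return everything that transitively depends on `component`."""
    affected = set()
    frontier = [component]
    while frontier:
        current = frontier.pop()
        for dependent, requirements in depends_on.items():
            if current in requirements and dependent not in affected:
                affected.add(dependent)
                frontier.append(dependent)
    return affected


def plan_interface_shutdown(depends_on, warn_if_disrupted):
    """Pre-maintenance check: compute warnings and alarms before the shutdown."""
    affected = affected_by("Interface", depends_on)
    return {
        "warnings": sorted(affected & warn_if_disrupted),  # provider rule check
        "alarms_to_disable": sorted(affected),             # expected side effects
        "action": "shut down interface",
    }
```

With these assumed edges, shutting down the interface reaches VPLS through OSPF, RSVP, and LSP, so the plan carries a VPLS warning before any alarm is disabled or any interface is touched.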
Conclusions
- Take-aways
  - Knowledge transfer in current management systems is mostly text-based, and thus costly and error-prone to build and maintain
  - We should build management systems on a machine-readable, shared, and embedded knowledge base
- Challenges
  - What does the knowledge base really look like?
  - Better integrate different contributors
  - Migrate from existing systems
- Future (ongoing) work
  - Drools-based implementation
  - Application to mobility management tasks
Page 13
Questions? Comments? Thanks! Page 14