Oracle BI 11g Diagnostics Oracle BI 11.1.1.6.0 Adam Bloom Oracle BI Product Manager
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
Agenda
Overview of Diagnostics and Logging Functionality
Troubleshooting an issue (using security as an example)
Overview of Diagnostics and Logging Functionality BI Fusion Middleware Weblogic
The BI Logging and Diagnostics Landscape FMW Logging and Diagnostics BI Logging and Diagnostics DMS Dynamic Monitoring Service DFW - Diagnostic Framework ECID Execution Context ID ODL - Diagnostic Logging BI Logging RDA Remote Diagnostic Agent Weblogic Logging and Diagnostics WLDF Weblogic Diagnostic Framework Weblogic Logging
BI-Specific Logging and Diagnostics
Book: BI System Administrator's Guide, Chapter 8
Section: Diagnosing and Resolving Issues in Oracle Business Intelligence
http://docs.oracle.com/cd/e23943_01/bi.1111/e10541/logging.htm#i1037454
Lists the various tools you can use, with doc links
Explains the log files and the log configuration settings for those logs
Specifics on the BI Server query log and a Log Viewer utility
Specifics on OBIPS logging configuration
Information on using ODBC/JDBC for looking at the BI Server
WebLogic Logging and Diagnostics (in brief) Book: Configuring Log Files and Filtering Log Messages for Oracle WebLogic Server Section: Understanding WebLogic Logging Services http://docs.oracle.com/cd/e23943_01/web.1111/e13739/logging_services.htm#i1172535 Explains how WLS logging works and the differences between the various WLS logs
Weblogic Diagnostic Framework
Book: Configuring and Using the Diagnostics Framework for Oracle WebLogic Server
http://docs.oracle.com/cd/e23943_01/web.1111/e13714/intro.htm#sthref6
WLDF provides features for monitoring and diagnosing problems in running WebLogic Server instances and clusters and in applications deployed to them.
Integration with Oracle JRockit
Diagnostic Image Capture
Captures and persists data events, log records, and metrics from server instances and applications.
Instrumentation: Adds diagnostic code to WebLogic Server instances and the applications running on them to execute diagnostic actions at specified locations in the code.
Harvester: Captures metrics from run-time MBeans, including WebLogic Server MBeans and custom MBeans, which can be archived and later accessed for viewing historical data.
Watches and Notifications
Monitoring Dashboard
Logging services: Manage logs
Weblogic Diagnostic Framework
The WebLogic Diagnostics Framework (WLDF) provides a diagnostic feature that allows MBean attributes to be "harvested" and monitored for specific conditions. This provides a proactive way of monitoring activity in your environment, allowing you to set up notifications (e.g. email, JMX notification) when the condition is triggered. DMS exposes DMS data using MBeans, so we can take advantage of this feature.
For example, we could configure WLDF to create a DFW incident (you could also set up an email notification) whenever the StarWarsAlliance "forcebalance" metric turns to "BAD", indicating that there is more bad than good in the galaxy...
Fusion Middleware Logging and Diagnostics
Book: Fusion Middleware Administrator's Guide, Chapter 12
Section: Managing Log Files and Diagnostic Data
http://docs.oracle.com/cd/e23943_01/core.1111/e10105/logs.htm#chddeebf
ODL - Diagnostic Logging
Execution Context (ECID)
DFW - Diagnostic Framework
DMS - Dynamic Monitoring Service
ODL - Background
Inconsistent Oracle diagnostic logs
9i and 10g: a buffet of log formats, locations, contents, timestamps, config methods, command line tools, rotation schemes, file names, tags, etc.
Oracle's merger and acquisition strategy adds complexity
11g goal: achieve consistent mechanics
ODL: Oracle Diagnostic Logging
1. OJDL: plugin for java.util.logging and log4j. Emits a standard format and rotates logs
2. JMX MBeans and WLST commands for config and query of logs
3. EM UI for config and query of logs
4. Support for C components as well
C components emit their own ODL logs with rotation and ECIDs
C component logs are queryable
ODL Features
EM UI and WLST command line tool
WLST commands: listLogs, displayLogs, getLogLevel, setLogLevel, listLoggers, listLogHandlers, configureLogHandler
Search by message type, by ECID, by recent messages or within a time range
Summarize by message type or component ID
Constrain by server, by application
Configure: log location, log level, log rotation with time or size threshold
Consistent Format and Location
[2008-09-28T23:22:52.388-07:00] [server_soa] [NOTIFICATION] [SOA-3453] [oracle.soa.mediator.service.common.functions] [tid: 11] [ecid: 0000HmgkkARF4E8Jviu1V118s7Cu000001,0] [APP: soa-infra] Successful SOA Initialization
Format: [tstamp] [component id] [type:level] [mesg-id] [module id] ([field-name: field-value])* message-text
Log location: domain_name/servers/server_name/logs/component_name-diagnostic.log
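The bracketed layout on this slide is regular enough to split mechanically. The following is a minimal Python sketch, not an Oracle tool: the function name, the dictionary keys, and the assumption that the message text does not itself begin with a bracketed "name: value" pair are all illustrative choices.

```python
import re

# Names for the five leading fields, in the order given by the slide's format
# line. The key names themselves are illustrative, not part of ODL.
FIXED_FIELDS = ["tstamp", "component_id", "msg_type", "msg_id", "module_id"]
BRACKET = re.compile(r"\[([^\]]*)\]\s*")


def parse_odl_line(line):
    """Split one ODL-style line into fixed fields, extra attributes and message."""
    record = {"attrs": {}}
    pos = 0
    # The first five bracketed groups are positional.
    for name in FIXED_FIELDS:
        match = BRACKET.match(line, pos)
        if match is None:
            raise ValueError("not an ODL-style line: %r" % line)
        record[name] = match.group(1).strip()
        pos = match.end()
    # Any further "[name: value]" groups are supplemental attributes.
    while True:
        match = BRACKET.match(line, pos)
        if match is None or ":" not in match.group(1):
            break
        key, _, value = match.group(1).partition(":")
        record["attrs"][key.strip()] = value.strip()
        pos = match.end()
    # Whatever remains is the free-text message.
    record["message"] = line[pos:].strip()
    return record


SAMPLE = (
    "[2008-09-28T23:22:52.388-07:00] [server_soa] [NOTIFICATION] [SOA-3453] "
    "[oracle.soa.mediator.service.common.functions] [tid: 11] "
    "[ecid: 0000HmgkkARF4E8Jviu1V118s7Cu000001,0] [APP: soa-infra] "
    "Successful SOA Initialization"
)

parsed = parse_odl_line(SAMPLE)
print(parsed["msg_id"], parsed["attrs"]["ecid"], parsed["message"])
```

Once lines are parsed this way, searching by message type or ECID becomes a simple filter over the resulting dictionaries.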
Why develop the ExecutionContext feature?
A single request to a system may involve many components on many servers (the slide diagram shows WS, J2EE, OHS and DB).
ExecutionContext facilitates the passing of context information between the components:
Allows diagnostic information from components to be correlated
Allows values to be passed between components
Some challenges remain
Hundreds of individually configured loggers
Need debugging strategies and processes
For WebLogic we don't have full integration yet. We can search WebLogic logs, but WebLogic has separate log files, configuration and log format.
FMW Diagnostic Framework
DFW is a diagnostic feature that detects critical failures and captures relevant diagnostics (logs, metrics, WLS server image, dumps and more) into "Incidents".
An Incident represents a single occurrence of a critical problem, usually a problem that requires interaction and data sharing between the customer and Oracle Support.
DFW reduces the time for issue resolution by automatically capturing targeted diagnostics at the time of failure. You can also create incidents manually.
Incident data can be analyzed onsite. Customers can also check for related issues on support.oracle.com or create Service Requests and upload Incident data to Oracle Support.
Metrics and Monitoring in Brief DMS
Background: DMS basics
DMS has been used to measure metrics in Oracle AS since the old OAS version of the product
Similar to V$ views in RDBMS: summary data, always on
Measures a wide variety of events consistently with low cost
Data summaries available via EM, command line tools, Servlets, MBeans, etc.
11g Achievements:
Monitoring of distributed WebLogic domains
JMX integration (multiple ways)
Integration with the new OPMN (for non-JEE components)
Most parts of 11g FMW are now instrumented with DMS
A snapshot of DMS data is captured with RDA
DMS: Dynamic Monitoring Service
In each monitored server:
FMW developers use DMS Java and C APIs to instrument individual components.
DMS code within each server computes individual metrics and keeps the values in server memory.
A DMS Spy (Servlet) in each server exports the metrics.
In the central WebLogic Admin Server:
DMS code discovers and collects metrics from the distributed domain of servers (via JMX for JEE servers, via OPMN for non-JEE).
DMS computes additional, aggregate metric values:
Summaries across distributed servers (e.g., aggregate throughput)
Summaries over time (e.g., rates)
DMS MBean (JMX) provides access to the metrics
Admin clients access the DMS MBeans: WLST, EM, generic JMX clients
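The two kinds of aggregate named on this slide can be illustrated with plain arithmetic. This is a minimal Python sketch of the idea only, not DMS code: the function names, server names and counter values are hypothetical.

```python
def total_across_servers(counts_by_server):
    """Summary across servers: add up one cumulative counter per server."""
    return sum(counts_by_server.values())


def rate_per_second(earlier, later):
    """Summary over time: change in a cumulative counter divided by elapsed time.

    Each argument is a (timestamp_in_seconds, cumulative_count) pair.
    """
    t0, c0 = earlier
    t1, c1 = later
    if t1 <= t0:
        raise ValueError("the later sample must have a later timestamp")
    return (c1 - c0) / (t1 - t0)


# Hypothetical completed-request counters sampled 60 seconds apart.
first_sample = {"bi_server1": 40, "bi_server2": 60}
second_sample = {"bi_server1": 150, "bi_server2": 250}

total_first = total_across_servers(first_sample)
total_second = total_across_servers(second_sample)
throughput = rate_per_second((0, total_first), (60, total_second))
print(total_second, throughput)
```

Keeping only cumulative counters in each server and deriving rates centrally is one way to keep per-server measurement cheap, which matches the "low cost, always on" goal stated for DMS.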
Monitoring BI Performance Metrics
BI System Administrator's Guide, Chapter 7
Using Fusion Middleware Control to View Common Performance Metrics
http://docs.oracle.com/cd/e23943_01/bi.1111/e10541/querycaching.htm#bcgjibjd
Using EM console for looking at metrics
Capturing Metrics Using the Oracle BI Systems Management API
System Administrator's Guide for Oracle Business Intelligence Enterprise Edition
Section: Capturing Metrics Using the Oracle BI Systems Management API
http://docs.oracle.com/cd/e23943_01/bi.1111/e10541/admin_api.htm#cdejjaah
In addition to the Metrics Browser in Fusion Middleware Control, you can view metrics for Oracle Business Intelligence using the Dynamic Monitoring Service (DMS) and WLST commands. This section describes how to use these methods.
This is Mike's Python script for getting metrics into a spreadsheet.
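The script referred to on this slide is not reproduced in the deck. As a rough illustration of the last step only (turning metric rows into something a spreadsheet can open), here is a minimal Python sketch. It assumes the metric rows have already been fetched and are available as dictionaries; the function name and the column names are hypothetical.

```python
import csv
import io


def metrics_to_csv(rows, out):
    """Write metric rows (a list of dicts) to a CSV stream.

    Rows may have different keys; missing values are left blank.
    Returns the column order that was used.
    """
    columns = sorted({key for row in rows for key in row})
    writer = csv.DictWriter(out, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return columns


# Hypothetical rows standing in for values read from a metric table.
sample_rows = [
    {"Name": "metric_a", "Value": 1},
    {"Name": "metric_b", "Host": "host1"},
]

buffer = io.StringIO()
used_columns = metrics_to_csv(sample_rows, buffer)
csv_lines = buffer.getvalue().splitlines()
print("\n".join(csv_lines))
```

In practice the stream would be a file opened with newline="" rather than an in-memory buffer.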
Troubleshooting an issue (using security as an example)
Architecture Overview WebLogic BI Publisher OPSS Identity Store LDAP: WLS, OID, AD etc. Analytics Security Service OWSM Policy Store Credential Store LDAP (OID) or File-based. MDS OBIPS OBIS Scheduler System User Connection
The Most Important Config Files and Logs

Component: BI Server
Config location: BIInstance/configuration/OracleBIServerComponent/coreapplication_obis1/NQSConfig.INI
Managed by: EM and manual
Log file: BIInstance/diagnostics/logs/OracleBIServerComponent/coreapplication_obis1/nqserver.log

Component: BI Presentation Services
Config location: BIInstance/configuration/OracleBIPresentationServicesComponent/coreapplication_obips1/instanceconfig.xml
Managed by: EM and manual
Log file: BIInstance/diagnostics/logs/OracleBIPresentationServicesComponent/coreapplication_obips1/sawlog0.log

Component: BI Security Web Service
Config location: N/A
Managed by: EM
Log file: BIDomain/servers/bi_server1/logs/bi_server1-diagnostic.log

Component: Managed Server (Weblogic) Security Providers
Config location: BIDomain/config/config.xml
Managed by: Weblogic Console
Log files: BIDomain/servers/bi_server1/logs/bi_server1.log, BIDomain/servers/bi_server1/logs/bi_server1-diagnostic.log, BIDomain/servers/bi_server1/logs/bi_server1.out

Component: OPSS Security Providers
Config location: BIDomain/config/fmwconfig/jps-config.xml
Managed by: EM
Log files: BIDomain/servers/bi_server1/logs/bi_server1.log, BIDomain/servers/bi_server1/logs/bi_server1-diagnostic.log, BIDomain/servers/bi_server1/logs/bi_server1.out
Troubleshooting, Diagnostics Process
When (e.g.) a User is Unable to Login:
Understand the end-to-end scenario, including the expected behavior
Work through the Causes of User cannot Login, with reference to the BI Security Troubleshooting guide, to isolate the problem
Check and fix any configuration issues
For issues that remain:
Enable additional logging as appropriate
Search logs for errors, with the aim of identifying the two components between which the error is occurring
Causes of User cannot Login Part 1
Check the correct, BI certified Authenticator is configured for the Identity Store
Authenticator Mis-configured
Only one User Affected
Check users are visible in Weblogic Console
Check Groups are visible in Weblogic Console
Check a user with appropriate permissions can login to the Weblogic console
Wrong credentials
Account locked or expired
Continued on next slide
Check the ordering and control flags on Authenticators
Check Account used for LDAP connection has sufficient privileges
Ensure user and group Base DNs are correct
Ensure from Name Filter queries are correct
Ensure the attributes specified (including User GUID) match what is in your LDAP store
WebLogic Admin user moved to LDAP and cannot boot WebLogic
Check Weblogic has been re-started after any config changes
Authenticator Mis-configured (second-level issues)
UserID has changed but GUID has not
Check all JEE applications are running
Check all BI System processes are running
Check Identity Store is available
Communication failure
Unable to Login
Causes of User cannot Login Part 2
BI System User authentication Failure
Check BI System User account exists and has correct roles.
Check BI System User is in sync with credential store
Check BI System User account in underlying identity store
Ensure Embedded WebLogic LDAP replication of BI System User credential change has not failed
BI 11.1.1.3 - the Authentication provider which refers to the BI user population including the BI System User must be set first of the highest strength control flag in the order of providers
Database error: issues connecting to the MDS-OWSM schema created on install
Oracle Web Services Manager errors
Issues with the OracleSystemUser account OWSM uses to access its resources
Still Unable to Login
BI 11.1.1.5 and later: as above, or virtualize = true must be set
If the attributes specified for username or guid have been set to something other than the default for the Weblogic Authenticator, ensure the OPSS configuration matches.
If using a SQL Authenticator, make sure the Adapters are configured correctly
BI 11.1.1.5 and later: if virtualize = true and the underlying Identity Store requires SSL, check libovd is configured correctly
Check Weblogic has been restarted after any config changes
Identity Store Provider (OPSS) mis-configured
Introducing the BI Security Diagnostics Helper The Oracle BI Security Diagnostics Helper is a JEE application that helps diagnose possible configuration issues which may prevent your users from being able to log in to your Oracle BI system.
Creating a Clean Set of Logs Process
Doc ID 1434514.1
Set appropriate log levels, stop all processes, clear logs, then restart and reproduce the error
Collect config.xml and jps-config.xml, and adapters.os_xml (if using multiple Authenticators, i.e. virtualize=true)
Work through the logs in the following order:
1. OBIPS (saw.log)
2. OBIS (NQServer.log)
3. Managed Server.out (Admin Server if using a Simple Install from 11.1.1.5)
4. Managed Server.log (Admin Server if using a Simple Install from 11.1.1.5)
5. Managed Server diagnostic log (Admin Server if using a Simple Install from 11.1.1.5)
6. Domain and Admin Server logs for MDS errors
Identify errors with timestamps and cross-reference between log files (maybe via the ECID in EM)
Identify the root cause error
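The cross-referencing step can be pictured as merging the per-file entries into one timeline and then narrowing to a single ECID. The following is a minimal Python sketch under stated assumptions: entries have already been extracted from each file as (timestamp, source, ECID, message) tuples in time order, and every timestamp, ECID and message shown is made up for illustration.

```python
import heapq
from datetime import datetime


def entry(timestamp, source, ecid, message):
    """Build one log entry; timestamp is an ISO-8601 string."""
    return (datetime.fromisoformat(timestamp), source, ecid, message)


def merge_by_time(*logs):
    """Merge per-file entry lists (each already in time order) into one timeline."""
    return list(heapq.merge(*logs, key=lambda item: item[0]))


def for_ecid(timeline, ecid):
    """Keep only the entries that carry the given ECID."""
    return [item for item in timeline if item[2] == ecid]


# Hypothetical entries; real ones would be parsed from the files listed above.
obips = [entry("2012-03-01T10:15:02", "sawlog0.log", "abc123", "login rejected")]
obis = [
    entry("2012-03-01T10:15:01", "nqserver.log", "abc123", "authentication call failed"),
    entry("2012-03-01T10:15:05", "nqserver.log", "zzz999", "unrelated query"),
]
managed = [
    entry("2012-03-01T10:15:00", "bi_server1-diagnostic.log", "abc123", "identity store lookup error"),
]

timeline = merge_by_time(obips, obis, managed)
related = for_ecid(timeline, "abc123")
for when, source, _, message in related:
    print(when.isoformat(), source, message)
```

Reading the filtered timeline from the earliest entry forward tends to show which component reported the problem first, which helps when looking for the root cause error.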
Turning on Additional Logging for Security
Suggested additional loggers to use:
oracle.bi.security - the BI Security Service logger
oracle.ods - for tracing when virtualize=true (the logger is only registered after setting virtualize=true)
oracle.idm.userroleapi - the logger for User Role API interaction (needs to be manually enabled)
Other security-related loggers that may be of use:
oracle.jps.common - the OPSS logger for generic errors
oracle.jps.*
oracle.security.jps.*
oracle.wsm.* - the OWSM logger
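Where these loggers are set from the command line rather than from EM, it can help to generate the commands once and paste them into a WLST session. The sketch below is plain Python that only builds command strings. It assumes the WLST setLogLevel command accepts target, logger, level and addLogger keyword arguments, and that bi_server1 is the managed server name and TRACE:32 the desired level; check each of these against the WLST command reference for your release before use. The helper name is hypothetical.

```python
# Loggers suggested on the slide above.
SECURITY_LOGGERS = [
    "oracle.bi.security",
    "oracle.ods",
    "oracle.idm.userroleapi",
]


def wlst_log_level_commands(loggers, target="bi_server1", level="TRACE:32"):
    """Return one WLST setLogLevel command string per logger.

    addLogger=1 is included on the assumption that it asks WLST to create
    a logger that is not yet registered.
    """
    template = "setLogLevel(target='%s', logger='%s', level='%s', addLogger=1)"
    return [template % (target, name, level) for name in loggers]


commands = wlst_log_level_commands(SECURITY_LOGGERS)
for command in commands:
    print(command)
```

Remember to lower the levels again once the clean set of logs has been collected, since trace-level logging is verbose.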
Turning on unlisted loggers
Q & A
Turning on logging
Log Configuration BI Security Service
Additional 11.1.1.5.0 Logging Options
Fusion Middleware Control's Selective Tracing feature can be used at a User level in 11.1.1.5.0
This may require additional settings in DOMAIN_HOME/bin/setDomainEnv.sh:
JAVA_OPTIONS="-Djava.util.logging.manager=oracle.core.ojdl.logging.ODLLogManager ${JAVA_OPTIONS}"
export JAVA_OPTIONS
FMWCONFIG_CLASSPATH="${FMWCONFIG_CLASSPATH}${CLASSPATHSEP}${ORACLE_COMMON_HOME}/modules/oracle.odl_11.1.1/ojdl.jar"
export FMWCONFIG_CLASSPATH
Selective Tracing
http://fmwdocs.us.oracle.com/doclibs/fmw/E10285_01/core.1111/e10105/logs.htm#i1021621
Sometimes you need more information to troubleshoot a problem than is usually recorded in the logs. One way to achieve that is to increase the level of messages logged by one or more components. For example, you can set the logging level to TRACE:1 or TRACE:32, as described in Section 12.4.3, which results in more detailed messages being written to the log files. This is referred to as tracing.
Diagnostics Spy
http://asengwiki.us.oracle.com/asengwiki/display/fmwdiag/diagnostics+Spy
ODL and Weblogic Logging
http://aseng-wiki.us.oracle.com/asengwiki/display/fmwdiag/integration+of+weblogic+and+odl+logging
http://aseng-wiki.us.oracle.com/asengwiki/display/fmwdiag/wldf+fmw+Diagnostic+Framework+Integration
Fusion Middleware Diagnostics Overview Our team provides portable frameworks used for diagnosability in Fusion Middleware. Key project areas include ODL (logging), DMS (performance metrics), Execution Context (ECID), and DFW (Diagnosability Framework). We also integrate FMW with related diagnostic frameworks and tools such as RDA (Support's Remote Diagnostic Agent framework), ADR (EM's Automated Diagnostic Repository), WLDF (WebLogic's Diagnostic Framework) and Enterprise Manager. Finally we create and administer diagnosability standards for the FMW Development Process.
What is an ExecutionContext - RID (Relationship ID)
The ECID is constant for this piece of work (shown in the diagram as Klw23k 2qL), but the RID is unique for each sub-task.
RIDs shown in the diagram: 0, 0:1, 0:1:1, 0:1:2, 0:1:3
The set of all ExecutionContexts sharing the same ECID is called a Family.
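The RID values on this slide (0, 0:1, 0:1:1, 0:1:2, 0:1:3) read like paths in a tree. The following minimal Python sketch rebuilds that tree, assuming that dropping the last colon-separated element of a RID gives the RID of the sub-task that spawned it. That interpretation is inferred from the values shown and is not stated in the source; the function names are hypothetical.

```python
def parent_rid(rid):
    """Return the RID of the assumed parent sub-task, or None for the root."""
    parts = rid.split(":")
    if len(parts) == 1:
        return None
    return ":".join(parts[:-1])


def build_family(rids):
    """Map each RID in a family (same ECID) to the RIDs of its direct children."""
    children = {rid: [] for rid in rids}
    for rid in rids:
        parent = parent_rid(rid)
        if parent in children:
            children[parent].append(rid)
    return children


# The RIDs shown on the slide, all sharing one ECID.
family = build_family(["0", "0:1", "0:1:1", "0:1:2", "0:1:3"])
for rid, kids in family.items():
    print(rid, "->", kids)
```

With log entries grouped by ECID first, a structure like this shows which sub-task within the family produced a given message.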
What is an ExecutionContext - The Map(s) Global map of values + + Local map of values
Creating an Incident Manually
http://fmwdocs.us.oracle.com/doclibs/fmw/e10285_01/core.1111/e10105/diagnostics.htm#beiccegd
13.4.5.1 Creating an Incident Manually
System-generated problems (critical errors generated internally) are automatically added to the Automatic Diagnostic Repository (ADR). You can gather additional diagnostic data on these problems, upload diagnostic data to Oracle Support, and in some cases, resolve the problems, all with the workflow that is explained in Section 13.4.
Consider creating an incident manually when you encounter an issue, such as a software failure or performance problem, and you want to gather more diagnostic data, but the Diagnostic Framework has not automatically created an incident.
You use the WLST command createIncident to create an incident manually. You can specify an incident based on time, a message ID, an impact area, or an ECID. Then, you can inspect the content of the incident or send it to Oracle Support for further analysis.
The following describes how to manually create an incident based on a message ID:
Search the log files, as described in Section 12.3.2. If you find a message that you suspect is related to the issue you are seeing, you can use the message ID when you create the incident.
Use the following commands to invoke WLST, connect to the Managed Server and navigate to the Managed Server instance:
java weblogic.WLST
connect('weblogic', 'password', 'localhost:7001')
cd('servers/server_name')
Create the incident, using the createIncident command, with the following format:
createIncident([adrHome] [,incidentTime] [,messageId] [,ecid] [,appName] [,description] [,server])
For example, to create an incident based on the error with the message ID MDS-50500, use the following command, specifying the message ID, and provide a description of the incident to help you and Oracle Support track the incident:
createIncident(messageId='MDS-50500', description='sample incident')
Incident Id: 55
Problem Id: 4
Problem Key: MDS-50500 [MANUAL]
Incident Time: 23rd February 2010 11:55:45 GMT
Error Message Id: MDS-50500
Flood Controlled: false
If you do not specify a server, the incident collects information from the server to which you are connected. To specify a server, use the server option, as shown in the following example:
createIncident(messageId='MDS-50500', description='sample incident', server='soa_server1')
If you do not specify the adrHome option, the incident is created in the server to which you are connected. For example, if you are connected to the Administration Server, the incident is created in the ADR home for the Administration Server.
The Diagnostic Framework evaluates the command and invokes the appropriate diagnostic dumps. The incident and the diagnostic dumps are written to the ADR. Each diagnostic dump writes its output to the incident.
You can view the information about the incident, as described in Section 13.4.2.2. You can view the information in the dumps, as described in Section 13.4.4.
DMS (Dynamic Monitoring Service/System) is an AS component that is used to collect system performance metrics. System components (OBIS, OBIPS, Javahost etc.) push system performance metrics to DMS.
DMS running in WLS accessing metrics in OPMN managed processes
The OPMN MBean mentioned above also provides an MBean operation for process discovery. DMS calls the OPMN process discovery MBean operation every 3 minutes to discover the running OPMN managed processes, including OHS. You can tune the discovery interval from 3 minutes to some other value if you wish, e.g. 10 seconds. The discovery interval is configured in dms_config.xml, which is located under the server configuration directory.
Metric rows reported by DMS can come from different processes. When a process goes down, its metric rows will be removed from all relevant metric tables. Note that there is a time delay for removing metric rows, because the same mechanism that discovers new processes (previous paragraph) is also responsible for discovering that processes have terminated.
The Oracle Dynamic Monitoring Service (DMS) provides a set of Java and C APIs that measure and report performance metrics, trace performance and provide a context correlation service for Fusion Middleware and other Oracle products. As well as the APIs, DMS provides interfaces to enable application developers, support analysts, system administrators, and others to measure application-specific performance information. It started life back in Application Server 1.0.2.2 days with the instrumentation API, and has since evolved and incorporated many more features along the way to provide an integrated performance measurement and tracing solution, as well as integrating with other key Oracle software.
DMS is split into three main areas:
Performance Metrics - This area of DMS consists of a Java and C API for instrumenting code with performance measurements and other useful state metrics, an aggregation language for computing derived metrics, and tooling for accessing the metrics. Using DMS provides Oracle developers with an Oracle-standard mechanism of reporting on the runtime state of their product or component to interested parties such as server administrators or application developers.
Execution Context - Execution context is a feature of DMS whose goal is to support the maintenance and