Three Simple Ways to Master the Administration and Management of an MDM Hub
Jitendra Malhotra, Lead Engineer, Global Customer Support
Avneet Bajwa, Senior Engineer, Global Customer Support
Informatica

Breakout Session Includes
  MDM Components
  Optimizing Hub Environment
  Tools & Utilities
  Effective Problem Resolution of MDM Hub
  IDD Best Practices

MDM Components

MDM Components
  Hub Store: A collection of databases in which business data is stored and consolidated. A Hub Store consists of one Master Database and one or more Operational Reference Stores (ORS).
  Hub Server: A J2EE application, deployed on the application server, that manages core and common services for the MDM Hub. It orchestrates data processing within the Hub Store as well as integration with external applications.
  Hub Console: The MDM user interface, comprising a set of administrative and data management tools for administrators and data stewards.
  Cleanse Match Server: Handles the cleansing operations that standardize data, and the match operations.

MDM Components (architecture diagram)
  User Interfaces: Hub Console (JNLP client), Custom Client, IDD (HTML client)
  SIF SDK APIs
  Application Server tier: Hub Server (SIF Engine), Cleanse Match Server (Cleanse Engine)
  Hub Store (Database tier): CMX_ORS1, CMX_ORS2, CMX_SYSTEM

Hub Environment
  Components on a single machine
  Components distributed on multiple machines

Process Flow (diagram; labels recovered from the slide)
  Sources (reference or relationship data): Data Source, Application, ETL, Msg Queue/Services
  Processing: Landing (raw), Delta Detection and Cleansing, Staging, Reject Management, Insert/Update, Apply Trust and Validation, New, Queued for Matching, Match, Queued for Merging, Auto Merge, Manual Merge, Un-Merge, Consolidation Process, Dynamic Cell-Level Survivorship
  Target data model: Name, Address, Product, Hierarchy
  Supporting elements: Events, Trust Framework, Rules-based Configuration Tools, Metadata, Rules, Validation, State Mgmt, Workflow, Event Trigger, Content, History, Lineage, X Ref, Trust Score, Audit
  Consumers (master reference or relationship data): Application, Data Warehouse, Msg Queue

SIF: Services Integration Framework (diagram; labels recovered from the slide)
  Applications: Business Data Director, Legacy, Composite, Portal, Oracle, SAP, Siebel
  Access interfaces: synchronous / asynchronous (EJB, SOAP, HTTP, JMS)
  Services: Business Services, Process Services, Data Services (schema-specific services and generic services); examples shown: Get Customer, Get Name, Get Address
  Events: Business Events, Data Events; examples shown: New Customer Profile, Name Change, New Address
  Services & Events Generator (design time)
  Security Access Manager (SAM)
  Multidomain MDM Hub

Optimizing Hub Environment

Optimizing Hub Environment
  Database Optimizations
    1. Init.ora parameter recommendations
  Application Server
    1. JVM parameters
    2. Connection pool sizes and JTA timeout

Database Optimization: Init.ora Parameters
Oracle DB parameters for baseline performance:
  memory_target      6000M
  memory_max_target  6000M
Adjust the Oracle PGA and SGA sizing to the memory available on the server. Unless otherwise noted, the recommendations are minimum settings; additional resources will improve performance.
For larger systems, you may need to raise PGA_AGGREGATE_TARGET and SGA_TARGET beyond the required 6 GB to make use of the total memory available. As a rule of thumb, total RAM = O/S allowance + two equal shares for SGA and PGA. For example: 32 GB (total memory) = 4 GB (O/S) + 14 GB (PGA) + 14 GB (SGA).
The Oracle 11g settings assume a machine with 8 GB of RAM.
Details of the init.ora settings are given in Knowledge Base article 90408.
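
The sizing rule above (RAM = O/S + two equal amounts for SGA and PGA) can be sketched as a small calculation. This is only an illustration: the function name and the 4 GB default O/S allowance are assumptions taken from the 32 GB example on the slide, not a documented Informatica or Oracle formula.

```python
def split_oracle_memory(total_gb: float, os_gb: float = 4.0) -> dict:
    """Split server RAM into an O/S allowance plus equal SGA and PGA shares.

    os_gb defaults to 4 GB only because the slide's 32 GB example uses it;
    treat it as an assumption and adjust for the actual server.
    """
    if os_gb >= total_gb:
        raise ValueError("O/S allowance must be smaller than total memory")
    share = (total_gb - os_gb) / 2  # SGA and PGA each get half of the remainder
    return {"os_gb": os_gb, "sga_gb": share, "pga_gb": share}


if __name__ == "__main__":
    # Reproduces the slide's example: 32 GB = 4 GB (O/S) + 14 GB (PGA) + 14 GB (SGA)
    print(split_oracle_memory(32))
```

The resulting shares would then be candidates for SGA_TARGET and PGA_AGGREGATE_TARGET, subject to review by the DBA.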

Init.ora Recommendations (continued)
Informatica recommended init.ora parameters:

  Oracle database parameter        Non-default required value
  -------------------------------  --------------------------
  db_block_checking                FALSE
  db_file_multiblock_read_count    Do not set
  db_cache_size                    2000M
  disk_asynch_io                   TRUE
  filesystemio_options             SETALL
  java_pool_size                   0
  large_pool_size                  400M
  log_buffer                       4002816
  open_cursors                     1000
  parallel_adaptive_multi_user     TRUE
  pga_aggregate_target             3000M
  processes                        1000
  recyclebin                       OFF
  sga_target                       4000M
  shared_pool_size                 400M
  streams_pool_size                0
  workarea_size_policy             AUTO
  utl_file_dir                     ** *
  db_writer_processes              1 or <number of CPUs/8>

NOTE: Use the default init.ora parameters except where noted above.
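
The db_writer_processes entry ("1 or <number of CPUs/8>") can be read as "at least 1, otherwise the CPU count divided by 8". The sketch below encodes that reading; the function name is hypothetical, and rounding down is an assumption, since the slide does not say how fractional results are handled.

```python
def db_writer_processes(cpu_count: int) -> int:
    """Suggest a db_writer_processes value from the CPU count.

    Assumption: the slide's "1 or <number of CPUs/8>" means the larger of
    the two, with the division rounded down.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return max(1, cpu_count // 8)


if __name__ == "__main__":
    for cpus in (4, 16, 64):
        print(cpus, "CPUs ->", db_writer_processes(cpus))
```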

Application Server Configuration
JVM settings:
  Xms = 512m
  Xmx = 2048m
  PermSize = 256m
  MaxPermSize = 512m
  Xss = 2048k
JNLP settings:
  jnlp.max-heap-size=2048m
  jnlp.initial-heap-size=512m

Optimizing Connection Pool Size
Connection pool size = maximum number of concurrent jobs * (maximum thread count + 2)
For example, to run 8 jobs in parallel with a maximum thread count of 16, you need 8 * (16 + 2) = 144 connections available in the data source. Normally not all of these connections are used, but this size is safe for all cases.
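
The connection pool rule above is simple arithmetic, sketched here so it can be re-run for other job and thread counts. The function and parameter names are illustrative, not part of any MDM or application server API.

```python
def required_connections(max_concurrent_jobs: int, max_thread_count: int) -> int:
    """Data source connections needed: jobs * (threads + 2), per the slide."""
    if max_concurrent_jobs < 0 or max_thread_count < 0:
        raise ValueError("counts must be non-negative")
    return max_concurrent_jobs * (max_thread_count + 2)


if __name__ == "__main__":
    # The slide's example: 8 parallel jobs, 16 threads each
    print(required_connections(8, 16))
```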

JTA Timeout
Set the JTA timeout to 600 or above.
Information on Application Server configuration is available in KB article 120187:
https://communities.informatica.com/infakb/solution/18/pages/120187.aspx?docid=120187&type=external&index=1

Tools & Utilities

Tools & Utilities
Various built-in MDM tools and external utilities can help users identify potential issues in the day-to-day operation of the Hub. Some of these tools and utilities are:
  Enterprise Manager
  Metadata Manager
  TEST_IO Utility
  Memory Analyzer Utility
  Heap Dump Utility

Enterprise Manager
View properties, version histories, and environment reports for:
  Hub Server
  Cleanse servers
  ORS databases
  Master Database

Enterprise Manager
Enterprise Manager is a powerful tool for inspecting the Hub environment through its Environment Report utility.

Metadata Manager
  Validates metadata and repairs some metadata errors
  Promotes changes from one environment to another
  Exports metadata from one environment to another

Other Useful Utilities
  TEST_IO Utility: A utility developed by MDM Engineering to help measure database disk I/O.
  Memory Analyzer: Helps monitor the health of application servers.
  Heap Dump Analyzer: Used to analyze issues related to out-of-memory server crashes.

TEST_IO Utility
Disk I/O speed is one of the most important factors in performance.

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse      735      0.01       0.05          0          1          0           0
Execute    738     43.69      76.72     511988     518341     523816    20000488
Fetch      733      0.02       1.22        194       1294          0         489
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     2206     43.73      78.00     512182     519636     523816    20000977

Verify that 20 million records were processed (see the rows column). If they were not, something went wrong with the script.
A good Execute elapsed time is between 40 and 100 seconds. An acceptable elapsed time is between 100 and 200 seconds. Any elapsed time above 200 seconds needs serious investigation and could have an enormous impact on SIF as well as batch job performance.
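
The elapsed-time thresholds above can be applied mechanically to the Execute row of the report. The sketch below is a hypothetical helper, not part of the TEST_IO utility: it assumes the eight-column layout shown on the slide, treats anything up to 100 as good (the slide does not say how to rate values below 40), and uses an assumed minimum of 20,000,000 rows for the record check.

```python
def parse_execute_row(report: str) -> dict:
    """Extract the Execute row from a report laid out like the slide's table."""
    for line in report.splitlines():
        fields = line.split()
        if len(fields) == 8 and fields[0] == "Execute":
            return {
                "count": int(fields[1]),
                "cpu": float(fields[2]),
                "elapsed": float(fields[3]),
                "disk": int(fields[4]),
                "query": int(fields[5]),
                "current": int(fields[6]),
                "rows": int(fields[7]),
            }
    raise ValueError("no Execute row found in report")


def rate_elapsed(elapsed: float) -> str:
    """Map Execute elapsed time to the slide's three bands."""
    if elapsed <= 100:
        return "good"
    if elapsed <= 200:
        return "ok"
    return "investigate"


SAMPLE = """\
Parse      735      0.01       0.05          0          1          0           0
Execute    738     43.69      76.72     511988     518341     523816    20000488
Fetch      733      0.02       1.22        194       1294          0         489
"""

if __name__ == "__main__":
    row = parse_execute_row(SAMPLE)
    # Assumed threshold for "20 million records were processed"
    print("rows ok:", row["rows"] >= 20_000_000)
    print("rating:", rate_elapsed(row["elapsed"]))
```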

Application Server Monitoring
Application Server:
  Server memory
  Live threads
  CPU utilization
  Class loaders

Heap Dump Analysis
Heap Analyzer:
  Histograms
  Leak Suspects
  Top Consumers
  Component Report
  Dominator Tree

Troubleshooting & Resolving Hub Issues

MDM Log Files
  Database debug log
  Cmxserver.log (server logs)
  Cmxserver.log (cleanse logs)
  Alert logs from Oracle
  Application server logs

MDM Components and Common Logs (diagram; exception and log pairings as they appear on the slide)
  User Interfaces (Hub Console JNLP client, Custom Client, IDD HTML client): SiperianCommunicationException; Custom Application Logs
  SIF SDK APIs: SiperianClientException; Custom Application Logs
  Application Server tier, Cleanse Match Server (Cleanse Engine): SiperianServerException; Cleanse Server Logs
  Application Server tier, Hub Server (SIF Engine): SiperianServerException; Hub Server Logs
  Hub Store, Database tier (CMX_ORS1, CMX_ORS2, CMX_SYSTEM): SiperianServerException; Database Logs

MDM Log Analysis: Scenario 1
  The customer reports that IDD throws an exception while performing a merge.
  Support is provided with the MDM Server logs in debug mode.
  Initial analysis points to the database side of MDM.
  Support requests the database debug logs for that time frame.
  After review, the issue is traced to a product defect and a fix is provided.

MDM Log Analysis
19-APR-2012 15:57:46.621[ERROR][sid:200][Preview_bvt:Generate_bvt... 79 package body CMXBV.1279] Autonomous SQL Error (-955).SQLERRM is: ORA-00955: name is already used by an existing object
19-APR-2012 15:57:46.676[ERROR][sid:200][Preview_bvt:Generate_bvt... 93 package body CMXBV.1293] Autonomous SQL Error (-4063). SQLERRM is: ORA-04063: view "CMX_ORS.T$1_BE0EA5683999CFB2E040_B" has errors
19-APR-2012 15:57:46.781[ERROR][sid:200][Preview_bvt:Generate_bvt... 06 package body CMXBV.1306] Autonomous SQL Error (-4063). SQLERRM is: ORA-04063: view "CMX_ORS.T$1_BE0EA5683999CFB2E040_G" has errors
19-APR-2012 15:57:46.782[ERROR][sid:200][Preview_bvt:Generate_bvt... 16 package body CMXBV.1316] Autonomous SQL Error (-4063). SQLERRM is: ORA-04063: view "CMX_ORS.T$1_BE0EA5683999CFB2E040_G" has errors
19-APR-2012 15:57:46.797[DEBUG][sid:200][Preview_bvt:Generate_bvt... 19 package body CMXBV.1319] CMX_ORS.T$1_BE0EA5683999CFB2E040_b dropped.
19-APR-2012 15:57:46.805[DEBUG][sid:200][Preview_bvt:Generate_bvt... 20 package body CMXBV.1320] CMX_ORS.T$1_BE0EA5683999CFB2E040_g dropped.
19-APR-2012 15:57:46.806[DEBUG][sid:200][Preview_bvt:Application... 09 package body CMXUT.2709] Module Name: Generate_bvt*****Exception Name: ERROR_BUILD_TABLE1
19-APR-2012 15:57:46.859[DEBUG][sid:200][Preview_bvt:Generate_bvt... 09 package body CMXUT.2709] Module Name: Preview_bvt*****Exception Name: NO_BVT_FOR_RECORD
19-APR-2012 15:57:46.869[DEBUG][sid:200][Preview_bvt:Preview_bvt... 86 package body CMXBV.2686] 0,SIP0: No BVT available for rowid_object 132057, SIP-28241: Error creating BV1 working table for C_PARTY: ORA-04063: view "CMX_ORS.T$1_BE0EA5683999CFB2E040_G" has errors
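
When a database debug log like the one above is long, a quick tally of the ORA- error codes helps show which failure came first and which ones repeat. The sketch below is a generic text scan, not an Informatica tool; the function name is hypothetical, and the sample lines are shortened copies of entries from the slide.

```python
import re
from collections import Counter

# Matches codes such as ORA-00955, and tolerates a stray space after the
# hyphen ("ORA- 04063"), which can appear when a log line has been wrapped.
ORA_CODE = re.compile(r"ORA-\s?(\d{5})")


def count_ora_errors(log_text: str) -> Counter:
    """Count occurrences of each ORA-nnnnn code in the given log text."""
    return Counter("ORA-" + code for code in ORA_CODE.findall(log_text))


SAMPLE_LOG = """\
19-APR-2012 15:57:46.621[ERROR][sid:200] Autonomous SQL Error (-955).SQLERRM is: ORA-00955: name is already used by an existing object
19-APR-2012 15:57:46.676[ERROR][sid:200] Autonomous SQL Error (-4063). SQLERRM is: ORA-04063: view "CMX_ORS.T$1_BE0EA5683999CFB2E040_B" has errors
19-APR-2012 15:57:46.869[DEBUG][sid:200] SIP-28241: Error creating BV1 working table for C_PARTY: ORA- 04063: view "CMX_ORS.T$1_BE0EA5683999CFB2E040_G" has errors
"""

if __name__ == "__main__":
    for code, hits in count_ora_errors(SAMPLE_LOG).most_common():
        print(code, hits)
```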

MDM Log Analysis: Scenario 2
  The customer reports an IDD operation that runs for a long time and then shows an IDD error message.
  Support requests the CMX Server logs.
  After reviewing the server logs and the server load, Support recommends increasing the thread pool queue size.

MDM Log Analysis
[2012-04-10 12:04:53,022] [http-0.0.0.0-8080-18] [SEVERE] javax.enterprise.resource.webcontainer.jsf.application: org.springframework.core.task.taskrejectedexception: Executor [java.util.concurrent.threadpoolexecutor@43ee60d2] did not accept task: com.siperian.dsapp.jsf.ui.dsbean.dataview.savehandle.compositeasyncsaveoperation@1d002224; nested exception is java.util.concurrent.rejectedexecutionexception
javax.faces.el.evaluationexception: org.springframework.core.task.taskrejectedexception: Executor [java.util.concurrent.threadpoolexecutor@43ee60d2] did not accept task: com.siperian.dsapp.jsf.ui.dsbean.dataview.savehandle.compositeasyncsaveoperation@1d002224; nested exception is java.util.concurrent.rejectedexecutionexception
    at javax.faces.component.methodbindingmethodexpressionadapter.invoke(methodbindingmethodexpressionadapter.java:102)
    at com.sun.faces.application.actionlistenerimpl.processaction(actionlistenerimpl.java:102)
    at javax.faces.component.uicommand.broadcast(uicommand.java:387)
    at org.ajax4jsf.component.ajaxactioncomponent.broadcast(ajaxactioncomponent.java:55)
    at com.exadel.siperian.component.sipuiajaxcommandbutton.broadcast(sipuiajaxcommandbutton.java:25)
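
The rejected-task error in the trace above is what a bounded task queue produces when work arrives faster than it is drained. The sketch below is only an analogy in Python using the standard queue module; it does not use or reproduce the IDD or Spring executor configuration, and the function name is hypothetical. It shows why a larger queue lets the same burst of tasks be accepted.

```python
import queue
from typing import Tuple


def submit_burst(task_count: int, queue_capacity: int) -> Tuple[int, int]:
    """Offer a burst of tasks to a bounded queue that no worker is draining.

    Returns (accepted, rejected). A full queue rejects further tasks, which
    is the situation the executor in the trace reports.
    """
    pending: queue.Queue = queue.Queue(maxsize=queue_capacity)
    accepted = 0
    rejected = 0
    for task_id in range(task_count):
        try:
            pending.put_nowait(task_id)  # raises queue.Full when at capacity
            accepted += 1
        except queue.Full:
            rejected += 1
    return accepted, rejected


if __name__ == "__main__":
    print("small queue:", submit_burst(task_count=10, queue_capacity=4))
    print("larger queue:", submit_burst(task_count=10, queue_capacity=16))
```

A larger queue trades rejections for waiting time, so server load should still be reviewed alongside the setting, as Support did in this scenario.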

IDD Best Practices
  SAM, HM Configuration
  IDD Design
  User Exits
  Sizing

Business Data Components Library and Business Data Components Platform
Deliver reliable and relevant data in real time through standard and composite applications.
(diagram; labels recovered from the slide)
  Business Data Components Library: Hierarchy, Potential Matches, Match Comparison, Point-in-Time History, Merge, Cross References
  Applications: VisualForce (Salesforce), NetWeaver (SAP), VBC (HTTP) (Siebel), Web Part (SharePoint), Portlet (Portals), iframes (Others), Legacy
  Platform: Build Component
  Informatica MDM: Multidomain MDM, Match and Merge, Business Data Director, Relationships, History / Lineage; Create, Consume, Manage, Monitor
  Sources and users: CIF, Legacy Systems, Application Users, Third Party Data
  Integration: ESB, EAI, MOM, ETL, SQL, JCA, JNI

IDD Best Practices
  Configuration: Security Access Manager, Hierarchy Manager, Sizing
  Design: IDD Configuration, ORS Design
  User Exits: Correct Entry Points, Optimization of Custom Code, SIF API

IDD Best Practices: Security Access Manager
  MDM access is divided into six categories: Read, Write, Update, Merge, Delete, Execute.
  Each resource related directly or indirectly to an IDD operation must be given the required access.
  The IDD implementation guide has good references for SAM configuration.

IDD Best Practices: Hierarchy Manager
  HM configuration in the Hub
  HM profile validation
  IDD configuration for single and multiple hops