Slide 1
WW TSS-04: Advanced Troubleshooting for Wonderware Application Server
Rich Liddell, Technical Account Manager, Global Customer Support, rich.liddell@invensys.com
Javier Aldán, Technical Account Manager, Global Customer Support, javier.aldan@invensys.com
2013 Invensys. All Rights Reserved. The names, logos, and taglines identifying the products and services of Invensys are proprietary marks of Invensys or its subsidiaries. All third party trademarks and service marks are the proprietary marks of their respective owners.
Agenda
- Tools & Techniques
- Install
- Deploy/Undeploy
- Multi-Galaxy Communication
- Tech Notes & Tech Alerts
Slide 3
Common tools
- SMC Logger
- Platform Manager
- Object Viewer
- Task Manager
- MiniDump
- Windows Event
- System Files
- Wonderware Developer Network (WDN)
Slide 4
Troubleshooting 101 - What Did You Change? Slide 5
System Management Console (SMC) Slide 6
Object Viewer Locate Process ID Slide 7
Object Viewer
Find off-scan or quarantined objects:
- Uncheck Search by Name
- Check only show objects
Slide 8
Object Viewer Find Object ID. Slide 9
Secret Dialog Menu Slide 10
How can I tell if someone deployed something?
"Did you deploy?" "Nope." "Are you sure?!?"
Gobject_Change_Log records:
- Objects affected
- Operation performed
- User comment
- User logged on
Slide 11
Return all operations for the past 24 hours
    SELECT Change.change_date, Change.user_profile_name, Oper.operation_name, user_comment, gobj.tag_name
    FROM gobject_change_log Change
    JOIN lookup_operation Oper ON change.operation_id = Oper.operation_id
    JOIN Gobject GObj ON GObj.gobject_Id = Change.gobject_Id
    WHERE Change.change_date > DateAdd(hour,-24,getdate())
    --and Tag_Name = 'UserDefined_001'
    --and Oper.operation_name like '%Deploy%'
    --or Oper.operation_name = 'ModifiedAutomationObjectOnly'
    ORDER BY change.change_date desc
Slide 12
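The commented-out filters in the query can be switched on selectively. The following is a minimal Python sketch that builds the same query with optional filters. It assumes the query is sent through a DB-API driver that uses `?` placeholders (pyodbc is one such driver). The helper `build_change_log_query` and its parameters are hypothetical and not part of the product; only the table and column names come from the slide.

```python
def build_change_log_query(hours=24, tag_name=None, operation_like=None):
    """Return (sql, params) for the change-log query shown on the slide.

    Hypothetical helper: the function and its parameters are illustrative.
    Table and column names are copied from the slide as written.
    """
    sql = [
        "SELECT Change.change_date, Change.user_profile_name,",
        "       Oper.operation_name, user_comment, gobj.tag_name",
        "FROM gobject_change_log Change",
        "JOIN lookup_operation Oper ON change.operation_id = Oper.operation_id",
        "JOIN Gobject GObj ON GObj.gobject_Id = Change.gobject_Id",
        "WHERE Change.change_date > DateAdd(hour, ?, getdate())",
    ]
    # A negative offset looks back in time, as in the original -24.
    params = [-abs(int(hours))]
    if tag_name is not None:
        sql.append("  AND gobj.tag_name = ?")
        params.append(tag_name)
    if operation_like is not None:
        sql.append("  AND Oper.operation_name LIKE ?")
        params.append(operation_like)
    sql.append("ORDER BY change.change_date desc")
    return "\n".join(sql), params
```

Calling `build_change_log_query(operation_like="%Deploy%")` returns the SQL text and the parameter list `[-24, "%Deploy%"]`, ready to hand to a cursor's `execute` method.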
Engine Attributes Slide 13
Automatic MiniDump Generation
- MiniDump enables any ArchestrA process to write its process information to a dump file if it terminates abnormally or hits an exception.
- If a minidump file is generated, it is created automatically at the default path: <drive>:\Program Files\ArchestrA\Framework\minidump
- The minidump can be quite large depending on the process (200-800 MB).
Slide 14
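Because each dump can reach several hundred megabytes, it helps to check how much space the dump folder is using. The following is a minimal Python sketch. The `*.dmp` file pattern is an assumption (the slide does not state the file extension), and `summarize_dumps` is a hypothetical helper name.

```python
from pathlib import Path


def summarize_dumps(folder, pattern="*.dmp"):
    """Return ([(file_name, size_mb), ...] largest first, total_mb).

    The "*.dmp" pattern is an assumption; adjust it to match the files
    actually found in the minidump folder.
    """
    files = [p for p in Path(folder).glob(pattern) if p.is_file()]
    sizes = sorted(
        ((p.name, p.stat().st_size / (1024 * 1024)) for p in files),
        key=lambda item: item[1],
        reverse=True,
    )
    total_mb = sum(size for _, size in sizes)
    return sizes, total_mb
```

Pass the minidump path from the slide, with `<drive>` replaced by the actual drive letter, as `folder`.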
Enable MiniDump
    Windows Registry Editor Version 5.00

    [HKEY_LOCAL_MACHINE\SOFTWARE\ArchestrA\Framework\debug]
    "MinidumpEnabled"=dword:00000001
    "MaximumDumpFilesAllowed"=dword:0000000a
    "MinidumpType"=dword:00000003
Slide 15
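The dword values in a .reg file are hexadecimal, so `0000000a` means 10 dump files. The following is a small Python sketch that decodes the values from the registry text above. `parse_reg_dwords` is a hypothetical helper, and the sketch only reads the text; it does not touch the registry.

```python
import re

# Registry text as shown on the slide.
REG_TEXT = r'''Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\ArchestrA\Framework\debug]
"MinidumpEnabled"=dword:00000001
"MaximumDumpFilesAllowed"=dword:0000000a
"MinidumpType"=dword:00000003
'''


def parse_reg_dwords(text):
    """Map each "name"=dword:xxxxxxxx line to its integer value."""
    pattern = re.compile(r'^"([^"]+)"=dword:([0-9a-fA-F]{8})\s*$', re.MULTILINE)
    return {name: int(value, 16) for name, value in pattern.findall(text)}


values = parse_reg_dwords(REG_TEXT)
print(values)
```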
Tech Note 726 Capturing a Memory Dump File Using the Microsoft Debug Diagnostic Tool (32bit) Slide 16
Wonderware Developer Network https://wdn.wonderware.com Slide 17
Contact Wonderware
Via Email: PremiumSupport@Invensys.com
Via Phone (US): 1-800-WONDER1 (or 1-800-966-3371)
Via Phone (international): 1-949-639-8500
You will need to have a UserID.
Slide 18
Install Issues Slide 19
Unable to Modify Install
- You have successfully installed.
- You go to modify the install via Add/Remove Programs.
- You see a quick flash and nothing happens.
- You uninstall and reinstall, yet see the same behavior.
Slide 20
Missing Install folder
C:\Program Files (x86)\Common Files\ArchestrA\Install\{F7D0430C-1E73-4546-BB8C-3E23DF991668}\External
Slide 21
We need to copy it from the Root of the install: Slide 22
Disable UAC during install
- You must disable User Account Control (UAC) before installing.
- Failing to do this can result in functional issues, which can lead to a re-install.
*NOTE: UAC must also be disabled on IDE and GR nodes. ArchestrA System Platform 2012 R2 supports UAC-enabled operations on run-time nodes.
Slide 23
Combating Deploy issues Slide 24
Deploy issues
Failed to deploy <PcName>: Failed to get the bootstrap's version
- Cause: The security policies in the customer domain are blocking some unauthenticated RPC calls with anonymous impersonation level.
- Fix: Static cloaking has been enabled for Bootstrap and GR, and code has been added to impersonate the ArchestrA user before deploy/undeploy.
Slide 25
Failed to get the bootstrap's version Slide 26
Failed to get the bootstrap's version Slide 27
Known Issue with Windows 7 Client
Version        Hotfix
3.1 SP3        L00124127
3.1 SP3 p01    L00122194
3.5            L00122265
3.6            L00123983
Slide 28
Deployment issues: What can go wrong?
- DCOM
- NMX
- Local mode
- Versions
- NIC binding order
- aabootstrap not responding
- aalogger hanging
- Global Data Cache
- Platforms still deployed but removed from GR
Slide 29
Deployment Troubleshooting Checklist
- Check time synchronization between Platform nodes
- Configure network binding order when using multiple networks
- Disable TCP Offload Engine (TOE)
- Set up the ArchestrA Admin account on all Platforms
- OSConfiguration utility: 3.1 SP3 P01 on 2008 sometimes requires the version from 2012 R2
Slide 30
Deployment Troubleshooting Checklist
- Make sure all Platforms have the same version and hotfixes
- Check firewall settings (required ports are documented in the ReadMe of the product)
- Check Tech Note 461, Troubleshooting Bootstrap communication
Slide 31
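The firewall item on the checklist can be spot-checked from the deploying node. The following is a minimal Python sketch that tests whether TCP ports on a remote node accept connections. The port list must come from the product ReadMe; none is assumed here, and `can_connect` and `check_ports` are hypothetical helper names. A successful TCP connect only shows that the port is reachable, not that DCOM or the Bootstrap is working.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host, ports, timeout=2.0):
    """Return {port: reachable} for each TCP port in ports."""
    return {port: can_connect(host, port, timeout) for port in ports}
```

Call `check_ports` with the remote node name and the ports listed in the ReadMe; any port reported as `False` is a candidate for a firewall rule review.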
Deployment Logflags
Category packages responsible for deployment:
- PlatformCategory: Package responsible for deploying the platform engine.
- EngineCategory: Package responsible for deploying redundant or non-redundant application engines.
- ApplicationCategory: Packages responsible for deploying Areas, DI Objects, and Application Objects.
Slide 32
Deployment Logflags
Components involved in the deploy/undeploy process:
- WWPackageManager: Component used by aagr clients (IDE, GRAccess) to interact with WWPackageServer.
- WWPackageServer: Component running under aagr (a service running on the Galaxy Repository node), which is used for interacting with the database and for validating and sorting the objects that have to be deployed.
Slide 33
Deployment Logflags
Components involved in the deploy/undeploy process:
- Bootstrap: Service that has to be installed on every IAS node and which, among other functions, is used during platform deployment and undeployment.
- Platform Install Manager: Responsible for installing all code modules on local or remote nodes using MSI.
Slide 34
Deployment Logflags
Components involved in the deploy/undeploy process:
- File Copy Service: Responsible for copying the files to remote nodes.
- DCOMTransport: The underlying transport used by the File Copy Service to transfer files between nodes.
Slide 35
GPO enabled
Unable to deploy/undeploy or configure objects in AppServer v3.6 with customer GPO enabled; access is denied due to insufficient permissions on the objects' "...\CheckedIn" and "...\CheckedOut" folders.
Hot Fix L00126108 (3.6)
Slide 36
Known Issue
Deploy of a redundant engine without cascade causes all running objects to be lost.
Hot Fix L00126469 (3.6)
Slide 37
Global Data Cache distribution
aaglobaldatacachemonitorsvr: ArchestrA GlobalDataCacheMonitorServices.
- This service appears in Task Manager once a platform is deployed to the machine.
- This service handles information for the Areas and alarms via XML, and also handles security calls.
Slide 38
Global Data Cache Slide 39
Overview Global Data Cache
[Diagram: GR Node (aabootstrap.exe) and Remote Platform (aabootstrap.exe, aaglobaldatacachemonitorsvr)]
Slide 40
Global Data Cache Issue
Couldn't get platform name - maybe the platform is not available at this time. IPlatformInformationClerk2::GetPlatformIdentity(Platformid=xx), hr = 80040405
- A Platform or Engine mismatch occurred because of non-functional Data Cache distribution between the Platforms.
- To resolve the mismatch, redeploy the remote Platform.
- Hotfix L00125442 (3.1 SP3 p01)
- Addressed in 3.6 p01 release
Slide 41
Global Data Cache Issue
GlobalDataCache folders do not sync if the aaglobaldatacachemonitor service crashes or is restarted.
- Hotfix L00125643 (3.6)
- Addressed in 3.6 p01 release
Slide 42
Orphaned platforms
Connection accepted from address <nodename1>, which differs from existing entry, address <nodename2>. New connection will be denied
- Root cause: an orphaned platform that was removed from the Galaxy improperly and is still trying to connect to the Galaxy.
- Identify the node where the platform is running and remove it by using Platform Remover.
Slide 43
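To find the node names involved, the warning can be searched for in exported log text. The following is a minimal Python sketch. It assumes the Logger messages have been exported to plain text with one message per line, and that the wording matches the slide exactly. `find_address_conflicts` is a hypothetical helper name.

```python
import re

# Pattern built from the warning text on the slide.
ORPHAN_PATTERN = re.compile(
    r"Connection accepted from address (?P<incoming>\S+?), "
    r"which differs from existing entry, address (?P<existing>\S+?)\. "
    r"New connection will be denied"
)


def find_address_conflicts(log_lines):
    """Return (incoming, existing) address pairs from matching log lines."""
    pairs = []
    for line in log_lines:
        match = ORPHAN_PATTERN.search(line)
        if match:
            pairs.append((match.group("incoming"), match.group("existing")))
    return pairs
```

Both addresses of each pair are returned so that each node can be checked for a platform that is no longer part of the Galaxy.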
Platform Exceed Maximum Heartbeats Slide 44
Platform Exceed Maximum Heartbeats
Solution: Set the proper value in the Platform and AppEngine configuration editors.
Slide 45
Platform Remover (Killer)
- Run as Administrator
- Clear out Checkpoint files: C:\Program Files (x86)\archestra\framework\bin\checkpointer
- Clear out Cache folder: <RootDrive>\ProgramData\ArchestrA\Cache
Slide 46
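The two clean-up steps can be scripted. The following is a minimal Python sketch that lists the contents of a folder and deletes them only when asked to. `clear_folder` is a hypothetical helper, and dry-run mode is the default so the entries can be reviewed first. The slide does not say whether any services must be stopped beforehand, so this sketch makes no assumption about that.

```python
import shutil
from pathlib import Path


def clear_folder(folder, dry_run=True):
    """List the entries directly under folder and, unless dry_run, delete them.

    The folder itself is kept. Returns the sorted names of the entries found.
    """
    root = Path(folder)
    entries = sorted(root.iterdir(), key=lambda p: p.name)
    if not dry_run:
        for entry in entries:
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)  # remove a subfolder and its contents
            else:
                entry.unlink()  # remove a file or a symbolic link
    return [entry.name for entry in entries]
```

Run it once with the default `dry_run=True` against each path from the slide, check the returned names, then repeat with `dry_run=False` from an elevated prompt.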
Platform Remover (Killer)
Fails to run when there are more than 100 platforms.
Slide 47
Scripting Considerations
- Using the right script
- Debugging: LogMessage()
- What is Async for
- Script Timeout/Error
Slide 48
Engineering Efficiency
Script Editor
- Auto-complete function (Me, MyContainer, Scripts)
- Multi-level Undo-Redo
- Line numbering
- Consistent color coding
- Syntax error indication
Slide 49
Engineering Efficiency
Scripting: Exception Handling
- Trap Exception
- Handle Exception
Slide 50
Let the Engine / Object Relax While First Loading
- Use a While True script instead of an On True script for large tasks (such as IO set reference).
- Delay with: If Script.ExecutionCnt == 2
Slide 51
Use LogMessage()
Why send needless log messages to the Logger unless they are required? Always wrap them in an IF statement:
    If me.debug then
        LogMessage(me.msg);
    Endif;
Slide 52
Async Scripts
- SQL scripts should always be Async.
- Engine.AsynScriptMaxThread: default size is 5.
- Engine.AsyncScriptsWaitingCnt: use this for sizing AsynScriptMaxThread.
Slide 53
Keep it Clean Slide 54
Keep it Clean
WAS Clean-up Guide:
- Improves time to open templates and objects.
- Improves time to check in objects and templates.
- Deploying the InTouch app is faster.
- Restoring a Galaxy is faster.
- Backup is faster and smaller.
Slide 55
Keep it Clean
Tech Note 930
https://wdnresource.wonderware.com/support/kbcd/html/1/t002746.htm
Slide 56
Multi-Galaxy Communication? Slide 57
Remote data
Symptom: View does not show remote Galaxy data
Possible reasons:
- MxData Service is not deployed
- Discovery Services are not configured correctly
- Platform is not deployed on the node where MxDataService is running
- Remote node is not reachable
Slide 58
Secure Write
Symptom: Writes do not work from InTouchView when security is enabled
Possible reasons:
- Security mode of Galaxy is set to Galaxy Security
- Security mode of InTouch is not set to ArchestrA
- User has not logged into the remote Galaxy at least once
- Default User Authentication service is not deployed on GR node
- Security mode of local and remote Galaxies does not match
- User does not have sufficient permissions to perform the write
- Remote node is not reachable
Slide 59
ASBService OS Account
1. What if the ASBService OS account is not permitted? What account can be used to start the service?
2. Can the ASBService OS account be disabled?
Slide 60
ASBService related warnings
3. ASBSecurity Proxy: Connect null FindResponse finding IManageASBSecurity on the SR node
- The ArchestrA Watchdog service needs to be started before creating a new Galaxy.
- Once the ArchestrA Watchdog service is fixed, the platforms have to be redeployed.
Slide 61
ASBService related warnings
4. aaservicesdeployagenthost -:- ASBSecurity Proxy: CallDisconnect delegate caught exception The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state.
- Tech Alert 173
- Uninstall / Reinstall product
Slide 62
ASBService OS Account
Tech Alert 173: Cannot Create a Galaxy or Connect to Any Existing Galaxy After Renaming a Computer if Wonderware Application Server 2012 R2 (Version 3.6) is Already Installed on the Computer
Slide 63
Failed to UnpairWithGR If one of the Galaxies used as a Galaxy Pair in a Multi-Galaxy Configuration is unavailable, the pair cannot be "unpaired." Slide 64
Failed to UnpairWithGR
- System Platform requires that both paired Galaxies be present for unpairing to occur cleanly.
- Outside of seeing the orphaned Galaxy pair in the paired Galaxy list, there is no adverse impact to the system's operation.
- To reduce orphaned Galaxy pairs, unpair Galaxies before disconnecting them from the network.
Slide 65
Hotfix
When using FSGateway in a multi-galaxy configuration and adding a large number of tags to FSGateway using an OPC Client, the tags get stuck in an initializing state.
Hotfix L00124824
Slide 66
Questions? Slide 67
Latest issues Slide 68
100% CPU on aaengine.exe
- Engines get stuck at 100% CPU.
- NmxSvc is modified to ensure that it doesn't send an incorrect disconnect message to the remote platforms.
- Hotfix L00124013 (3.5 p01), L00127549 (3.6)
*Addressed in 3.6 p01 release.
Slide 69
RDI object
Bad items that do not exist in the PLC cause RDI to take the AppEngine down over time.
Hotfix L00128094 (3.6)
Slide 70
Old Alarms
- Old alarms show in Alarm Control.
- They cannot be acknowledged.
- Hotfix L00127843 (3.6)
Slide 71
Tech Alerts
Tech Alert 173: Cannot Create a Galaxy or Connect to Any Existing Galaxy After Renaming a Computer if Wonderware Application Server 2012 R2 (Version 3.6) is Already Installed on the Computer
Slide 72
Tech Alerts
Tech Alert 174: System Corruption Can Result when Importing Object Files (aapkg) Created in a Higher Application Server Version
- Cannot deploy objects after importing objects developed in 3.1 SP3 P01 to 3.1 SP3 (exists in all versions of Application Server up to 2012 R2)
Slide 73
Slide 74
Tech Alerts
Tech Alert 180: Silenced Alarms are not Logged in the WWAlmDB Database
Tech Alert 181: Platform Fails to Deploy on Server 2003 SP2 or XP SP3 Nodes When Using App Server 3.6 P01
Slide 75
Wonderware Developer Network https://wdn.wonderware.com Slide 76
Contact Wonderware
Via Email: PremiumSupport@Invensys.com
Via Phone (US): 1-800-WONDER1 (or 1-800-966-3371)
Via Phone (international): 1-949-639-8500
You will need to have a UserID.
Slide 77
Questions? Slide 78
Slide 79