Module 10: Maintaining Active Directory

Lesson: Backing Up Active Directory
    Topic: How to Back Up Active Directory
Lesson: Restoring Active Directory
    Topic: How to Perform a Primary Restore
Lesson: Planning for Monitoring Active Directory
    Topic: Events to Monitor
    Topic: Performance Counters to Monitor
Lesson: Backing Up Active Directory

Topic: How to Back Up Active Directory

Procedure

To back up the system state data by using the Ntbackup command-line tool, open a command prompt and type:

    ntbackup backup systemstate

For example, to create a backup named Backup Job 1 that backs up the system state data to the file C:\backup.bkf, type:

    ntbackup backup systemstate /J "Backup Job 1" /F "C:\backup.bkf"

If you do not specify the other Backup options, ntbackup uses the Backup program's default values for the backup type, the verification setting, the logging level, hardware compression, and other settings.

To view the complete syntax for the ntbackup command, at a command prompt, type:

    ntbackup /?
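If the backup is launched from a script rather than typed by hand, the command line above can be assembled programmatically. The following is a minimal sketch in Python, not part of the course procedure: the function names `build_systemstate_backup` and `run_backup` are hypothetical, and the sketch assumes that ntbackup is available on the path of the computer where it runs. It uses only the /J and /F switches shown in the example above.

```python
import subprocess


def build_systemstate_backup(job_name: str, backup_file: str) -> list[str]:
    """Return the argument list for a system state backup with ntbackup.

    Only the /J (job name) and /F (backup file) switches from the example
    in the text are used; all other options keep the Backup program's defaults.
    """
    return ["ntbackup", "backup", "systemstate", "/J", job_name, "/F", backup_file]


def run_backup(job_name: str, backup_file: str) -> None:
    """Run the backup (hypothetical helper; requires ntbackup on the path)."""
    # Passing a list lets subprocess quote arguments that contain spaces,
    # such as the job name "Backup Job 1".
    subprocess.run(build_systemstate_backup(job_name, backup_file), check=True)


if __name__ == "__main__":
    # Print the arguments instead of running them, so the sketch is safe
    # to try on a computer that does not have ntbackup installed.
    print(build_systemstate_backup("Backup Job 1", r"C:\backup.bkf"))
```

Keeping the command construction separate from its execution makes it possible to review or log the exact arguments before a backup is started.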
Lesson: Restoring Active Directory

Topic: How to Perform a Primary Restore

Additional options for a primary restore

Typically, you perform a primary restore only when all the domain controllers in the domain are lost, and you are trying to rebuild the domain from the backed-up data. To perform a primary restore, under the Advanced Restore options, you can select the "When restoring replicated data sets, mark the restored data as the primary data for all replicas" option; however, you can also use the following additional options.

The following table lists the additional options that you can select to perform a primary restore of distributed data.

Option | Description
Restore security | Restores security settings for each file and folder. Security settings include permissions, audit entries, and ownership. This option is available only if you have backed up data from an NTFS volume that is used in the Microsoft Windows Server 2003 family and you are restoring it to an NTFS volume that is used in the Windows Server 2003 family.
Restore junction points, and restore file and folder data under junction points to the original location | Restores junction points on the hard disk, as well as the data that the junction points point to. If you do not select this check box, the junction points are restored as common directories, and the data that the junction points point to is not accessible. Also, if you are restoring a mounted drive, and you want to restore the data that is on the mounted drive, you must select this check box. If you do not select this check box, you can only restore the folder that contains the mounted drive.
When restoring replicated data sets, mark the restored data as the primary data for all replicas | Ensures that the restored File Replication Service (FRS) data is replicated to the other servers. Select this option only when you restore the first replica set to the network. Do not use this option if one or more replica sets have already been restored.
Restore the Cluster Registry to the quorum disk and all other nodes | Ensures that the cluster quorum database is restored and replicated on all the nodes in a server cluster. If you select this option, Backup stops the Cluster service on all the other nodes of the server cluster after the node that was restored is restarted. The entire server cluster will therefore be down during an authoritative restore of the data on the cluster quorum disk.
Preserve existing volume mount points | Prevents the restore operation from overwriting any volume mount points that you have created on the partition or volume that you are restoring data to. This option is primarily applicable when you are restoring data to an entire drive or partition.
Lesson: Planning for Monitoring Active Directory

Topic: Events to Monitor

Examples of events for the domain controller on the network

The events that are listed in the following table indicate when the event log service has started and shut down, and whether a domain controller has a problem registering Domain Name System (DNS) name records. Use the event log service events to determine when the system was running. The DNS events are included because DNS name resolution is critically important throughout the environment.

Event log | Source | Event | Description
System | EVENTLOG | 6005 | Event log service has started, often because the computer has restarted.
System | EVENTLOG | 6006 | Event log service has shut down, often because the computer has restarted.
System | DNSAPI | 11154, 11166 | Domain controller does not have sufficient rights to perform a secure dynamic update.
System | DNSAPI | 11150, 11162 | DNS server has timed out.
System | DNSAPI | 11152, 11153, 11164, 11165 | The zone or the currently connected DNS server does not support dynamic updates.
System | DNSAPI | 11151, 11155, 11163, 11167 | A resource record for the domain controller is not registered in DNS.
System | NETLOGON | 5773 | One or more domain controller locator records are not registered because the primary DNS server does not support dynamic updates.
System | NETLOGON | 5774 | One or more domain controller locator records are not registered in DNS.

Examples of events for core Active Directory functionality

The following events indicate problems with core Microsoft Active Directory directory service functionality.

Event log | Source | Event | Description
Directory Service | All Sources | Severity = error | The primary error events for Active Directory.
System | LSASS | Severity = error | Local Security Authority (LSA) is the core security subsystem for Active Directory.
Examples of events for replication

The following events may indicate problems with SYSVOL replication or the application of Group Policy.

Event log | Source | Event | Description
FRS | All Sources | Severity = error | FRS is used to synchronize policy between all the domain controllers in the forest.
Application | USERENV | Severity = error; User = System | Responsible for the application of Group Policy and profiles on domain controllers.
Application | SCECLI | Severity = error; 1058 | Security Configuration Engine error messages. Often a transitory problem. Alerts if there are more than 5 events in 30 minutes.

Examples of events for authentication

The following events may indicate problems with the maintenance of uniform time throughout the Active Directory forest; with Kerberos, the default authentication protocol; and with the Net Logon service and protocol, which are required for proper domain controller functionality.

Event log | Source | Event | Description
System | W32TIME | Severity = error; Severity = warning | The Windows Time service is used to synchronize time between domain controllers.
System | Kerberos V5 Key Distribution Center (KDC) | Severity = error; report event 11 weekly | Critical KDC service error messages.
System | NETLOGON | Severity = error; report events 5705 and 5723 weekly | Critical NETLOGON service errors.
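The rule "alerts if there are more than 5 events in 30 minutes" is a sliding-window count. The following Python sketch shows one way such a rule could be evaluated; it is an illustration only and not the logic of any particular monitoring product. The class name `BurstAlert` is hypothetical, and the sketch assumes that events are recorded in chronological order.

```python
from collections import deque
from datetime import datetime, timedelta


class BurstAlert:
    """Signal when more than max_events occur within a sliding time window."""

    def __init__(self, max_events: int = 5, window: timedelta = timedelta(minutes=30)):
        self.max_events = max_events
        self.window = window
        self._times = deque()  # timestamps of events still inside the window

    def record(self, when: datetime) -> bool:
        """Record one event and return True if the alert condition is met.

        Assumption: events are recorded in chronological order.
        """
        self._times.append(when)
        cutoff = when - self.window
        # Discard events that are older than the window.
        while self._times and self._times[0] < cutoff:
            self._times.popleft()
        return len(self._times) > self.max_events


if __name__ == "__main__":
    alert = BurstAlert()
    start = datetime(2024, 1, 1, 12, 0)
    for minute in range(6):
        fired = alert.record(start + timedelta(minutes=minute))
        print(minute, fired)
```

In this run the sixth event within the same half hour is the first one that returns True, because six events exceed the limit of five.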
Topic: Performance Counters to Monitor

Performance counters for monitoring the quantity of replicated data

The following performance counters monitor the quantity of replicated data. Thresholds are determined by the baselines that you have already established, unless otherwise indicated.

Object | Counter | Interval | Description
NT directory service (NTDS) object | DRA Inbound Bytes Compressed | 15 minutes | Indicates the amount of replication data that flows to the site. A significant change in the counter indicates a replication topology change or that significant data was added or changed in Active Directory.
NT directory service (NTDS) object | DRA Outbound Bytes Compressed | 15 minutes | Indicates the amount of replication data that flows out of the site. A significant change in the counter indicates a replication topology change or that significant data was added or changed in Active Directory.
NT directory service (NTDS) object | DRA Outbound Bytes Not Compressed | 15 minutes | Indicates the amount of replication data that is outbound from the domain controller within the site.
NT directory service (NTDS) object | DRA Outbound Bytes Total/sec | 15 minutes | Indicates the amount of replication data that is outbound from the domain controller. A significant change in the counter indicates a replication topology change or that significant data was added or changed in Active Directory.
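The text does not define what counts as a "significant change" from the baseline. One possible definition, shown in the following Python sketch, is a sample that lies more than a chosen number of standard deviations from the baseline mean. The function name `deviates_from_baseline` and the default tolerance of 3 standard deviations are assumptions made for illustration, not values from the course material.

```python
from statistics import mean, stdev


def deviates_from_baseline(baseline: list[float], sample: float, tolerance: float = 3.0) -> bool:
    """Return True if sample is more than `tolerance` standard deviations
    away from the mean of the baseline measurements.

    Assumptions: `baseline` holds at least two earlier samples of the same
    counter, and a tolerance of 3.0 is only an illustrative default.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation; needs >= 2 values
    if sigma == 0:
        # A perfectly flat baseline: any different value is a deviation.
        return sample != mu
    return abs(sample - mu) > tolerance * sigma


if __name__ == "__main__":
    # Hypothetical 15-minute samples of a replication byte counter.
    baseline = [100.0, 110.0, 90.0, 105.0, 95.0]
    print(deviates_from_baseline(baseline, 120.0))
    print(deviates_from_baseline(baseline, 400.0))
```

With this baseline, a sample of 120 stays inside the tolerance band and a sample of 400 falls well outside it. A stricter or looser tolerance changes how often a change is flagged, so the value should be tuned against the baselines that you have established.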
Performance counters for monitoring core Active Directory functions and services

The following performance counters monitor the core Active Directory functions and services. Thresholds are determined by baselines that you have already established, unless otherwise indicated.

Object | Counter | Interval | Description
NTDS | DS Search sub-operations/sec | 15 minutes | A significant change in this counter indicates domain controller performance problems. Check whether applications are incorrectly targeting the domain controller.
Process | % Processor Time - LSASS | 1 minute | Indicates the amount of CPU that is used by Active Directory.
NTDS | LDAP Searches/sec | 15 minutes | Indicates the amount of overall use for a domain controller. Ideally, this counter is fairly uniform across the domain controllers. A change in this counter may indicate that a new application is targeting the domain controller or that more clients were added to the network.
NTDS | LDAP Client Sessions | 5 minutes | Indicates the number of clients that are currently connected to the domain controller. A significant change may indicate that computers are failing over to the domain controller. By establishing a baseline on this counter, you can also collect useful information about what time of day people are connecting, and the maximum number of client computers that connect each day.
Process | Private Bytes - LSASS | 15 minutes | Used for establishing baseline memory requirements by domain controller. If the counter registers a continual increase, workstation demand has increased, applications are not working correctly (not closing handles), or an increased number of workstations are targeting the domain controller. When this counter significantly deviates from the normal value of other peer domain controllers, you should investigate the source of the demand.
Process | Handle Count - LSASS | 15 minutes | By comparing statistics on this counter with baseline measurements, you can see if applications are not working correctly and not closing handles properly. The statistics increase linearly as client workstations are added.
Process | Private Bytes - LSASS | 15 minutes | By comparing statistics on this counter with baseline measurements, you can see when Active Directory is running low on virtual memory address space, which may indicate a memory leak. If the counter suggests a leak, verify that you are running the latest service pack and schedule a restart during off-hours to avoid a system outage. Set an alert when the counter value exceeds 2 gigabytes.
Performance counters for monitoring key security volumes

The following performance counters monitor key security volumes. Thresholds are determined by baselines that you have already established, unless otherwise indicated.

Counter | Interval | Description
NTLM Authentications/sec | 15 minutes | Indicates the number of clients that are authenticating against the domain controller by using NTLM instead of Kerberos (Microsoft Windows NT 4.0 and earlier clients or interforest authentications).
KDC AS Requests/sec | 15 minutes | Indicates the number of ticket-granting tickets being issued by the KDC. This counter is a good indicator to observe the impact of changing the ticket lifetime.
Kerberos Authentications/sec | 15 minutes | Indicates the amount of authentication load that is placed on the KDC. This counter is helpful for establishing baselines.
KDC TGS Requests | 15 minutes | Indicates the number of session tickets being issued by the KDC. This counter is a good indicator to observe the impact of changing the ticket lifetime.

Performance counters for monitoring core operating system indicators

The following performance counters monitor core operating system indicators, which have a direct impact on Active Directory performance.

Object | Counter | Interval | Threshold | Description
Memory | Page Faults/sec | 5 minutes | 700/s | A high rate of page faults per second indicates a lack of physical memory.
Physical Disk | Current Disk Queue Length | 1 minute | 2, averaged over 3 intervals | Indicates a backlog of disk I/O requests. Consider increasing disk and controller throughput.
Processor | % DPC Time - _Total (instance) | 15 minutes | 10 | Indicates work that was deferred because the domain controller was too busy and can indicate processor congestion.
System | Processor Queue Length | 1 minute | 6, averaged over 5 intervals | Indicates that the CPU is not fast enough to process requests as they occur. If the replication topology is correct and the condition is not caused by failover from another domain controller, consider upgrading your CPU.
Memory | Available MBytes | 15 minutes | 4 MB | When the threshold is met, the system has run out of available memory. Imminent service failure is likely.
Processor | % Processor Time - _Total | 1 minute | 85%, averaged over 3 intervals | Indicates that the CPU is overloaded. Determine whether the CPU load is caused by Active Directory by examining the "% Processor Time - LSASS" counter.
System | Context Switches/sec | 15 minutes | 70,000 | Indicates excessive transitions. There may be too many applications or services running, or their load on the system is too high. Consider off-loading this demand.
System | System Up Time | 15 minutes | (none) | Measures domain controller reliability.
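Several thresholds above are "averaged over" a number of intervals, which means that a single high sample does not raise an alert by itself. The following Python sketch shows one way to evaluate such a rule; the class name `AveragedThreshold` is hypothetical, and the sketch assumes that the threshold is breached when the average is strictly greater than the limit. A counter such as Available MBytes, where a low value is the problem, would need the opposite comparison.

```python
from collections import deque


class AveragedThreshold:
    """Check a counter against a threshold averaged over the last N samples."""

    def __init__(self, threshold: float, intervals: int):
        self.threshold = threshold
        # A bounded deque keeps only the most recent `intervals` samples.
        self._samples = deque(maxlen=intervals)

    def add(self, value: float) -> bool:
        """Add one sample; return True if the rolling average exceeds the threshold.

        Assumption: no alert is raised until a full set of samples is available.
        """
        self._samples.append(value)
        if len(self._samples) < self._samples.maxlen:
            return False
        average = sum(self._samples) / len(self._samples)
        return average > self.threshold


if __name__ == "__main__":
    # "% Processor Time - _Total": 85%, averaged over 3 one-minute intervals.
    cpu = AveragedThreshold(threshold=85.0, intervals=3)
    for sample in [90.0, 95.0, 60.0, 99.0, 99.0]:
        print(sample, cpu.add(sample))
```

In this run only the last sample returns True: the three most recent values (60, 99, 99) average 86, which is above 85, whereas the earlier windows average below the threshold even though individual samples exceed it.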