I am writing this blog and others to explain how things work and some ways deployment and operational tasks can be handled. In other words, these postings are for demonstration purposes only. Since I am not familiar with your organization or environment I do not know if these steps are applicable to your environment or are even safe to perform in your environment. It is recommended that you contact Microsoft Support prior to making changes in your environment to ensure that these steps are applicable to your environment, and are safe to perform in your environment. By writing this blog I am in no way recommending that you perform these steps in your own environment. If you choose to follow the steps outlined in this or other blog postings on this site, you are assuming the risk for your actions.
1 Troubleshooting Active Directory Replication
Repadmin is a tool for checking replication status and troubleshooting replication issues. The table below highlights commonly used repadmin syntax.
|Command||Description|
|Repadmin /replsummary||Quickly and concisely summarizes the replication state and relative health of a forest.|
|Repadmin /replsummary /bysrc /bydest /sort:delta||Produces the same summary, broken out by source and by destination domain controller and sorted by the largest delta since the last successful replication.|
|Repadmin /showrepl <DC Name>||Displays the replication partners for each directory partition on the specified domain controller. Helps the administrator build a visual representation of the replication topology and see the role of each domain controller in the replication process.|
|Repadmin /showutdvec <DC Name> <Naming Context>||Displays the highest committed Update Sequence Number (USN) that the specified domain controller has recorded for itself and for the other domain controllers that replicate the partition. This information shows how up-to-date a replica is with its replication partners.|
|Repadmin /showobjmeta <DC> <DN of object>||Displays the replication metadata for a specified object stored in Active Directory, such as attribute ID, version number, originating and local Update Sequence Number (USN), and the originating server's GUID and date and time stamp. By comparing the replication metadata for the same object on different domain controllers, an administrator can determine whether replication has taken place.|
|Repadmin /showconn||Displays the connection objects for a specified domain controller. Default is local site.|
|Repadmin /replsingleobj <DC List> <Source DSA Name> <Object DN>||Replicates a single object between any two domain controllers that have partitions in common. The two domain controllers do not need to have a replication agreement with each other. Replication agreements can be shown by using the Repadmin /showrepl command.|
|Repadmin /replicate <Destination_DC_List> <Source DC Name> <Naming Context>||Starts a replication event for the specified directory partition from the source domain controller to the destination domain controllers. The source can be identified by viewing the replication partners with Repadmin /showrepl.|
|Repadmin /syncall <DC>||Synchronizes a specified domain controller with all replication partners.|
|Repadmin /queue||Displays tasks waiting in the replication queue.|
|Repadmin /showmsg <Error>||Displays the error message for a given error number.|
|Repadmin /viewlist <DC_List>||Displays a list of domain controllers.|
|Repadmin /showctx <DC_List>||Displays a list of computers that have opened sessions with a specified domain controller.|
|Repadmin /showcert||Displays the server certificates loaded on a specified domain controller.|
|Repadmin /removelingeringobjects <Dest_DC_List> <Source DC GUID> <NC> [/ADVISORY_MODE]||Uses an authoritative domain controller (the source) as a reference to check a destination domain controller that is suspected of having lingering objects. When the advisory mode parameter is used, the command only lists the lingering objects it finds. When the advisory mode parameter is not used, the command removes the lingering objects from the destination domain controller.|
Additional information on Repadmin.exe is available here: https://technet.microsoft.com/en-us/library/cc736571(v=ws.10).aspx
1.2 Repadmin /replsummary
As seen in the screenshot below, repadmin /replsummary gives replication statistics for each replication partner. The output also lists any errors that were encountered during replication. This is useful for getting an overview of any replication issues the DC is having.
You can also sort the output. In the example below, the output is sorted by the largest delta since last replication.
1.3 Repadmin /showrepl
As seen below, repadmin /showrepl shows the replication status with all of the DC's replication partners, grouped by the Naming Context that is being replicated.
One trick for getting more manageable output is to have repadmin emit CSV and then use PowerShell to convert the CSV into a GridView. The command to do this is: repadmin /showrepl * /csv | ConvertFrom-Csv | Out-GridView
The resulting output is in a manageable GUI.
In GridView you can sort and filter. Below is an example of filtering on Number of Failures, so that I can easily see what failed.
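The same filter can be applied at the console instead of in GridView. The following is a minimal sketch, assuming the CSV column names that repadmin /showrepl /csv emits (for example "Number of Failures" and "Source DSA"). The sample rows, including the dates and DC names, are invented so that the snippet runs without a domain controller.

```powershell
# Sample text standing in for the output of: repadmin /showrepl * /csv
# The rows are invented for illustration; the column names are assumed to match repadmin's CSV header.
$sampleCsv = @'
"showrepl_COLUMNS","Destination DSA Site","Destination DSA","Naming Context","Source DSA Site","Source DSA","Transport Type","Number of Failures","Last Failure Time","Last Success Time","Last Failure Status"
"showrepl_INFO","SiteA","DC01","DC=fabrikam,DC=com","SiteA","DC02","RPC","0","0","2016-01-10 09:15:02","0"
"showrepl_INFO","SiteA","DC01","DC=fabrikam,DC=com","SiteB","DC03","RPC","4","2016-01-10 09:00:41","2016-01-09 21:00:13","1722"
'@

# Keep only the rows that report at least one failure, and trim the columns.
$failures = $sampleCsv |
    ConvertFrom-Csv |
    Where-Object { [int]$_.'Number of Failures' -gt 0 } |
    Select-Object 'Destination DSA', 'Source DSA', 'Naming Context', 'Number of Failures', 'Last Failure Status'

$failures | Format-Table -AutoSize
```

Against a live environment, the here-string would be replaced by the repadmin command itself, piped into the same ConvertFrom-Csv and Where-Object stages.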
1.4 Repadmin /showutdvec
Replication changes are tracked through incrementing numbers called USNs. There are times when you will want to know what each DC knows about the current state of the other DCs. The up-to-dateness vector is the knowledge that a DC has about the current state of other DCs. This information can be useful when troubleshooting replication issues such as USN rollback. USN rollback occurs when a DC is restored using an unsupported method such as a snapshot. In that case, the USN that partners hold for the restored DC in their up-to-dateness vectors is much higher than that DC's own current USN.
Because there is always some delay in replication, you will notice some differences, but the numbers should be relatively close. For example, if you compare the up-to-dateness vector for DC01 across DCs you will notice the following: for itself DC01 has a USN of 17347, DC02 has a USN of 17346 for DC01, and DC03 has a USN of 17346 for DC01. So the numbers are relatively close, and DC01 potentially has one change that it still needs to replicate to DC02 and DC03.
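The comparison in the DC01 example can be written down as a small calculation. This is only a sketch using the three USN values quoted above. The table layout and the "investigate" note are my own illustration, not repadmin output.

```powershell
# USNs that each DC has recorded for DC01, taken from the example in the text.
$usnForDc01 = [ordered]@{
    'DC01' = 17347   # DC01's own highest USN
    'DC02' = 17346   # what DC02 has seen from DC01
    'DC03' = 17346   # what DC03 has seen from DC01
}

$own = $usnForDc01['DC01']

$report = foreach ($dc in $usnForDc01.Keys) {
    $delta = $own - $usnForDc01[$dc]
    [pscustomobject]@{
        Observer   = $dc
        UsnForDc01 = $usnForDc01[$dc]
        Delta      = $delta
        # A negative delta means a partner holds a higher USN for DC01 than DC01
        # reports for itself, which is the pattern described for USN rollback.
        Note       = if ($delta -lt 0) { 'partner ahead of source: investigate' } else { 'ok' }
    }
}

$report | Format-Table -AutoSize
```

A small positive delta, as here, simply means the partner has not yet received the latest change.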
1.5 Repadmin /showobjmeta
The /showobjmeta switch shows detailed replication metadata for the attributes of an object. It is most commonly used to compare the output from two DCs to see whether they are in sync and what the current version of each attribute is. Differences can be used to identify replication problems.
1.6 Repadmin /syncall
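Repadmin /syncall is used to force replication between domain controllers. You can view the options for the /syncall switch with the following command: repadmin /syncall /?
A common invocation is repadmin /syncall /AeP, which synchronizes all partitions held by the DC (/A), crosses site boundaries (/e), and pushes changes outward from the specified DC (/P).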
1.7 Repadmin /showmsg
The /showmsg switch converts an error number that you may receive as the result of a repadmin command into human-readable text.
1.8 Repadmin /viewlist
Repadmin /viewlist is used to get a list of domain controllers.
PowerShell is an object-oriented scripting language that allows enterprises to automate IT tasks.
Below is a conversion table that shows the PowerShell cmdlet that can be used in place of each Repadmin command. So why would you choose PowerShell? The output of PowerShell commands is objects. Those objects can be filtered by their properties, piped through other PowerShell commands, and manipulated to do many useful things, including giving you great control over how the data is presented to the user.
|Repadmin /setattr||Set-ADObject|
Get-ADReplicationPartnerMetadata is very similar to running repadmin /showrepl. Without passing the output through another cmdlet, the formatting is a bit different from what you get with repadmin.
However, the advantage is that the output of the command is objects. You can constrain your view to certain properties.
The other advantage is that you can pass objects through other cmdlets. As seen here, I am passing the output of Get-ADReplicationPartnerMetadata through Out-GridView.
Once in GridView you have the ability to sort and filter the data.
Here is another example of the usefulness of PowerShell over repadmin. In this example I take the output of Get-ADReplicationPartnerMetadata and pass it through Select-Object to limit which properties are presented in GridView.
Here we see the output of that command.
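Because the output is objects, the same pipeline can also be narrowed to the partners that are failing. The sketch below is one way to do that. Select-FailingPartner is a hypothetical helper name of my own, and the property names are the ones I would expect on the objects returned by Get-ADReplicationPartnerMetadata, so check them against your own output.

```powershell
# Hypothetical helper: keeps only partners that report failures and trims the property list.
function Select-FailingPartner {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $Metadata
    )
    process {
        if ($Metadata.ConsecutiveReplicationFailures -gt 0) {
            $Metadata | Select-Object Server, Partner, Partition, LastReplicationSuccess,
                ConsecutiveReplicationFailures, LastReplicationResult
        }
    }
}

# On a machine with the ActiveDirectory module, usage might look like this:
#   Get-ADReplicationPartnerMetadata -Target * -Scope Server | Select-FailingPartner | Out-GridView
```

Keeping the filter in its own function means it can be tried against hand-built objects before it is pointed at production domain controllers.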
1.10 Replication Errors
Here is a list of replication errors you may come across in either the Directory Services event log or while running repadmin.
|Event ID||Replication Error||Issue|
|1925|| ||DNS lookup issues or connectivity problems|
|2087|| ||DNS lookup issues|
|2088|| ||DNS lookup issues|
|1311|| ||Replication topology issues|
| ||8614||Tombstone lifetime exceeded|
| ||8524||DNS lookup failure|
| ||8456||Server is currently rejecting replication requests|
| ||8457||Server is currently rejecting replication requests|
| ||8453||Access was denied|
| ||8452||The naming context is in the process of being removed or is not replicated from the specified server|
| ||5||Access is denied|
| ||-2146893022||The target principal name is incorrect|
| ||1753||There are no more endpoints available from the endpoint mapper|
| ||1722||The RPC server is unavailable|
| ||1396||Logon failure: The target account name is incorrect|
| ||1256||The remote system is not available|
| ||1127||While accessing the hard disk, a disk operation failed even after retries|
| ||8451||The replication operation encountered a database error|
| ||8606||Insufficient attributes were given to create an object|
2 Troubleshooting Steps for Common Replication Issues
2.1 Troubleshooting -2146893022 (The target principal name is incorrect)
On the DC that is the cause of the error, perform the following steps:
Step 1: Open Services.msc
Step 2: Set the startup type of the Kerberos Key Distribution Center (KDC) service to Manual
Step 3: Stop the Service
Step 4: Restart the Domain Controller
Step 5: Open PowerShell as an Administrator
Step 6: Run: $cred = Get-Credential
Step 7: Enter Credentials and click OK
Step 8: Run: Reset-ComputerMachinePassword -Server <ServerName> -Credential $cred (where <ServerName> is a healthy domain controller that will set the new password)
Step 9: Restart the server
Step 10: Set the KDC service to Automatic, Start the service and click OK.
2.2 Troubleshoot Replication Error 8606, Event ID 1388, and Event ID 1988
These issues are caused by lingering objects. Lingering objects can be caused when a domain controller is taken offline for an extended period of time, does not replicate for longer than the tombstone lifetime, or is restored from a backup that is older than the tombstone lifetime.
When an object is deleted, it is put in a tombstone state. After the tombstone lifetime (TSL) passes (typically 180 days), DCs run garbage collection and those tombstoned objects are permanently deleted. If a DC was offline for the entire TSL and is then brought back online, it may still hold objects that have since been deleted, tombstoned, and garbage collected everywhere else. Those objects continue to exist on that DC. They go unnoticed until a change is made to one of them and the DC attempts to replicate it. At that point the object is either re-introduced into the environment or, if strict replication consistency is enabled, blocked.
2.2.1 How to Determine TSL
Run the following command: dsquery * "cn=directory service,cn=windows nt,cn=services,cn=configuration,<Forest DN>" -scope base -attr tombstonelifetime
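The same attribute can be read with the ActiveDirectory PowerShell module. The following is a minimal sketch. Get-DirectoryServiceDn is a hypothetical helper of my own that only builds the distinguished name, and the commented lines assume a machine where the ActiveDirectory module is available.

```powershell
# Hypothetical helper: builds the DN of the Directory Service object
# from the forest's configuration naming context.
function Get-DirectoryServiceDn {
    param(
        [Parameter(Mandatory = $true)]
        [string]$ConfigurationNamingContext
    )
    "CN=Directory Service,CN=Windows NT,CN=Services,$ConfigurationNamingContext"
}

# With the ActiveDirectory module loaded, the lookup might look like this:
#   $dn = Get-DirectoryServiceDn (Get-ADRootDSE).configurationNamingContext
#   (Get-ADObject -Identity $dn -Properties tombstoneLifetime).tombstoneLifetime

# Example using the fabrikam.com forest from this post:
Get-DirectoryServiceDn 'CN=Configuration,DC=fabrikam,DC=com'
```

If the lookup returns nothing, the attribute is not set on the object and the forest is using its built-in default.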
2.2.2 How to Remove Lingering Objects
2.2.2.1 Repadmin /removelingeringobjects
One way to remove lingering objects is to use repadmin with the /removelingeringobjects switch. First you must identify a clean source of the partition. The syntax of the command is repadmin /removelingeringobjects <Dest DC Name> <Source DC GUID> <Naming Context>. In other words, you need to identify the source DC's GUID and the Naming Context you want to clean. The naming context is listed in the Event 1388 or 1988 you receive in the event log. Once you find a clean source, you can obtain its GUID by opening DNS Manager, opening the _msdcs zone, and finding the CNAME record for the DC in question.
Below is an example of running the repadmin /removelingeringobjects command
You will receive an Event 1937 when the removal of lingering objects begins.
You will then receive an Event 1939 when removal completes.
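Since the command is destructive without /advisory_mode, it can help to assemble the command line first and review it before running anything. The sketch below only builds the string. New-LingeringObjectCommand is a hypothetical helper name, and the GUID shown is a placeholder rather than a real DSA GUID.

```powershell
# Hypothetical helper: assembles the repadmin command line. Nothing is executed here.
function New-LingeringObjectCommand {
    param(
        [Parameter(Mandatory = $true)][string]$DestinationDc,
        [Parameter(Mandatory = $true)][guid]$SourceDsaGuid,
        [Parameter(Mandatory = $true)][string]$NamingContext,
        [switch]$AdvisoryMode
    )
    $parts = @(
        'repadmin', '/removelingeringobjects',
        $DestinationDc, $SourceDsaGuid.ToString(), "`"$NamingContext`""
    )
    # Advisory mode only reports lingering objects; without it they are removed.
    if ($AdvisoryMode) { $parts += '/advisory_mode' }
    $parts -join ' '
}

# As an alternative to DNS Manager, the source GUID can also be read from the DC's
# NTDS Settings object (assumes the ActiveDirectory module):
#   (Get-ADObject (Get-ADDomainController -Identity DC01).NTDSSettingsObjectDN).ObjectGUID

# Placeholder GUID for illustration only.
New-LingeringObjectCommand -DestinationDc 'DC03' `
    -SourceDsaGuid '11111111-2222-3333-4444-555555555555' `
    -NamingContext 'DC=fabrikam,DC=com' -AdvisoryMode
```

Running in advisory mode first and reviewing the reported objects is a reasonable precaution before repeating the command without the switch.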
2.2.2.2 Repadmin /rehost
An alternative to the repadmin /removelingeringobjects command is to unhost the partition, so that the domain controller no longer holds it, and then rehost the entire partition from a good source.
The repadmin syntax for unhosting the partition is: repadmin /unhost <DC Name> <Partition Name>
You will receive an Event 1658 when the removal begins.
You will receive an Event 1660 when the removal completes.
The syntax for rehosting the partition is: repadmin /rehost <Dest DC Name> <Partition> <Source DC Name>
2.3 Troubleshooting Event ID 2042
Review the event log for any Event ID 1988 or 1388 entries. If any are found, use the previous section to remove the lingering objects from the domain controller.
Option 1: Re-hosting the partition that has not replicated
If the partition is a GC partition consider unhosting and rehosting the partition. Instructions for unhosting and rehosting are in the previous section called Repadmin /rehost
Option 2: Removing and then re-adding the domain controller to Active Directory
Another option is to remove the DC from Active Directory and then re-promote it as a domain controller.
Step 1: Run Import-Module ADDSDeployment
Step 2: Run: Uninstall-ADDSDomainController -DemoteOperationMasterRole:$true -Force:$true
Step 3: Enter and confirm the new local password
Step 4: Next you will need to run the Install-ADDSDomainController cmdlet. Below is a sample that you can use. You will need to modify the template to meet the requirements of your environment.
Install-ADDSDomainController -NoGlobalCatalog:$false -CreateDnsDelegation:$false -CriticalReplicationOnly:$false -DatabasePath "C:\Windows\NTDS" -DomainName "fabrikam.com" -InstallDns:$true -LogPath "C:\Windows\NTDS" -NoRebootOnCompletion:$false -ReplicationSourceDC "DC01.fabrikam.com" -SiteName "Default-First-Site-Name" -SysvolPath "C:\Windows\SYSVOL" -Force:$true
Option 3: Enabling Replication with Divergent and Corrupt Partner
Because of the risk of adding lingering objects to Active Directory, enabling the following setting should be the last option you consider: Allow Replication With Divergent and Corrupt Partner.
Step 1: To enable this setting run the following command on the domain controller:
repadmin /regkey <hostname> +allowDivergent
Step 2: Let replication complete
Step 3: Disable the setting with the following command: repadmin /regkey <hostname> -allowDivergent
2.4 Troubleshooting Event ID 1311
Event 1311 is caused when there is not complete connectivity between domain controllers. There are a number of reasons there may not be complete connectivity.
The Inter-Site Topology Generator (ISTG) is responsible for building the inter-site replication topology. To determine the scope of the connectivity issues, it is important to identify the ISTGs that are logging Event 1311.
To find the ISTGs in your environment, you can use ldp.exe.
Below are the steps for locating the ISTGs:
Step 1: Launch ldp.exe
Step 2: When LDP opens, select Connection and then Connect…
Step 3: In the Connect dialog box, enter the name of a Domain Controller for the Server you want to connect to and then click OK
Step 4: Click on Connection and then click Bind…
Step 5: In the Bind dialog box, click OK
Step 6: Select the Browse menu and then select Search
Step 7: In the search enter the following:
Base DN: CN=Sites,CN=Configuration,<DN of Forest Root> (example: CN=Sites,CN=Configuration,DC=fabrikam,DC=com)
Filter: (CN=NTDS Site Settings)
Attributes: Append the following to the attributes that are already listed: ;interSiteTopologyGenerator
Step 8: Click Run
Step 9: For each site, look for the interSiteTopologyGenerator attribute to determine that site's ISTG.
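The ldp.exe search above can also be approximated in PowerShell. This is a sketch under the assumption that the ActiveDirectory module is available. Get-IstgServerName is a hypothetical helper of my own that extracts the server name from the attribute value, which is the distinguished name of a DC's NTDS Settings object.

```powershell
# Hypothetical helper: pulls the server name out of an interSiteTopologyGenerator value.
function Get-IstgServerName {
    param([string]$NtdsSettingsDn)
    # Expected shape: CN=NTDS Settings,CN=<server>,CN=Servers,CN=<site>,CN=Sites,...
    if ($NtdsSettingsDn -match '^CN=NTDS Settings,CN=([^,]+),CN=Servers,') {
        $Matches[1]
    }
}

# With the ActiveDirectory module loaded, the search from Steps 7 to 9 might look like this:
#   $base = "CN=Sites,$((Get-ADRootDSE).configurationNamingContext)"
#   Get-ADObject -LDAPFilter '(cn=NTDS Site Settings)' -SearchBase $base -Properties interSiteTopologyGenerator |
#       Select-Object DistinguishedName,
#           @{ Name = 'ISTG'; Expression = { Get-IstgServerName $_.interSiteTopologyGenerator } }

# Example value for a DC named DC01 in a site named SiteA (names are illustrative):
Get-IstgServerName 'CN=NTDS Settings,CN=DC01,CN=Servers,CN=SiteA,CN=Sites,CN=Configuration,DC=fabrikam,DC=com'
```

The helper returns nothing when the attribute is empty or has an unexpected shape, so sites without a recorded ISTG show up as blank rather than causing an error.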
By default, Bridge All Site Links (BASL) is enabled in Active Directory. If your environment is not fully routed, then you will want to disable BASL. By fully routed we mean each site can contact every other site. If BASL is configured on a network which is not fully routed, the KCC will generate site bridges that cannot actually be reached. To determine if BASL is enabled launch Active Directory Sites and Services (dssite.msc).
Expand Sites, then Inter-site Transports.
Right-click on IP and select Properties from the context menu
If Bridge all site links is enabled, there will be a check box next to it. To disable BASL, uncheck the checkbox and click OK.
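The checkbox state can also be read from the directory. The sketch below assumes that the setting is stored in the options attribute of the IP transport object and that bit 0x2 (NTDSTRANSPORT_OPT_BRIDGES_REQUIRED) being set means BASL is turned off. Treat both as assumptions to verify in your own environment. Test-BridgeAllSiteLinks is a hypothetical helper name.

```powershell
# Hypothetical helper: interprets the options value of the IP inter-site transport.
# Assumption: bit 0x2 set means "Bridge all site links" is disabled.
function Test-BridgeAllSiteLinks {
    param([int]$Options = 0)
    -not ($Options -band 2)
}

# With the ActiveDirectory module loaded, the check might look like this:
#   $ip = "CN=IP,CN=Inter-Site Transports,CN=Sites,$((Get-ADRootDSE).configurationNamingContext)"
#   Test-BridgeAllSiteLinks ([int](Get-ADObject -Identity $ip -Properties options).options)

Test-BridgeAllSiteLinks -Options 0   # attribute unset or zero: BASL enabled
Test-BridgeAllSiteLinks -Options 2   # bridges-required bit set: BASL disabled
```

An unset attribute converts to 0 in the commented lookup, which matches the default of BASL being enabled.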
2.4.3 Site Link Bridges
If you disable BASL, you can still bridge site links. You would do that if you wanted two spoke sites to communicate directly when they cannot communicate with the hub site. In a hub-and-spoke configuration, the cost of crossing two site links (bridging a site link) will typically be higher than the cost of connecting directly to the hub site. So ordinarily you would not have to worry about the Site Link Bridge being used instead of a direct site link. That being said, there are not many scenarios where you would need to create Site Link Bridges.
The following steps will allow you to bridge two site links.
Step 1: Open the Active Directory Sites and Services MMC.
Step 2: Expand Sites and then expand Inter-site Transports
Step 3: Right-click IP and select New Site Link Bridge… from the context menu
Step 4: Add at least two site links to the Site Link Bridge, give it a name, and click OK
The Site Link Bridge is now complete.
2.4.4 Verify that all Sites are in a Site Link
Step 1: Run the following command in a PowerShell console: Get-ADObject -LDAPFilter '(objectClass=site)' -SearchBase (Get-ADRootDSE).ConfigurationNamingContext -Property Name | Format-Table Name
Step 2: In another PowerShell console run: Get-ADObject -LDAPFilter '(objectClass=siteLink)' -SearchBase (Get-ADRootDSE).ConfigurationNamingContext -Property Name, Cost, Description, siteList | Format-List Name, siteList
Step 3: Verify that each site that was listed in Step 1 exists in one of the site lists returned in Step 2
If not all sites are contained in a site link, then you need to determine which site link each missing site should be added to, or whether a new site link needs to be created.
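The manual comparison in Step 3 can be automated as a set difference. This is a rough sketch. Get-SiteWithoutLink is a hypothetical helper, and the commented lines assume the ActiveDirectory module and that the siteList attribute holds the distinguished names of the sites in each link.

```powershell
# Hypothetical helper: returns the sites that do not appear in any site link.
function Get-SiteWithoutLink {
    param(
        [string[]]$SiteDn,        # distinguished names of all sites
        [string[]]$LinkedSiteDn   # distinguished names found in any siteList
    )
    # -notcontains compares strings case-insensitively.
    $SiteDn | Where-Object { $LinkedSiteDn -notcontains $_ }
}

# With the ActiveDirectory module loaded, the inputs might be gathered like this:
#   $config = (Get-ADRootDSE).configurationNamingContext
#   $sites  = Get-ADObject -LDAPFilter '(objectClass=site)' -SearchBase $config |
#                 ForEach-Object { $_.DistinguishedName }
#   $linked = Get-ADObject -LDAPFilter '(objectClass=siteLink)' -SearchBase $config -Properties siteList |
#                 ForEach-Object { $_.siteList }
#   Get-SiteWithoutLink -SiteDn $sites -LinkedSiteDn $linked

# Illustrative input: SiteC is not a member of any site link.
$allSites = 'CN=SiteA,CN=Sites,CN=Configuration,DC=fabrikam,DC=com',
            'CN=SiteB,CN=Sites,CN=Configuration,DC=fabrikam,DC=com',
            'CN=SiteC,CN=Sites,CN=Configuration,DC=fabrikam,DC=com'
$inLinks  = 'CN=SiteA,CN=Sites,CN=Configuration,DC=fabrikam,DC=com',
            'CN=SiteB,CN=Sites,CN=Configuration,DC=fabrikam,DC=com'
$orphans  = Get-SiteWithoutLink -SiteDn $allSites -LinkedSiteDn $inLinks
$orphans
```

An empty result means every site is a member of at least one site link.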
And that is all I have for replication troubleshooting for today.