VM snapshots and AD domain membership

Error message: "The trust relationship between this workstation and the primary domain failed."
Virtual machines are very popular in software development, packaging, and testing, because the snapshot feature makes it possible to keep certain known states of a VM (e.g. a fresh and clean OS install) and return to them very easily. But there are some issues associated with using snapshots that you should be aware of. Here is one: if you need to join your Windows test VMs to an Active Directory (AD) domain and regularly send them back in time (by reverting them to a snapshot that you created several days or weeks ago), then you are probably familiar with the annoying effect that the machines eventually "drop off the domain".

In this blog post I will provide a way to
  a) prevent this in the first place, and
  b) fix it in an automated way if it cannot be avoided.

But first, some technical background on what happens here and why. A machine that has been joined to an AD domain has a computer account in this domain and maintains a password for this account. It stores one copy of this password locally in its registry, and another copy is kept on the domain controllers. When authenticating against a domain controller, the machine presents its locally stored password, and access is granted only if it matches the copy stored in AD. If the passwords do not match, the "trust relationship" of the machine is broken and it can no longer access domain resources. This means, for example, that you can no longer log on to the machine with a domain account. Instead you are presented with the error message shown above.

By default, a domain member machine changes its password every 30 days and updates the copy stored in AD to match. It is important to understand that this process is purely client-driven: a machine password does not expire in AD. Even if a machine never changes its password, the domain controllers will not complain as long as the presented password matches the copy stored in AD.

Now what happens if a machine is reset to a snapshot? The machine will then use the locally stored password that was valid at the time when the snapshot was created, although it might have changed it in the meantime. No matter how old the snapshot is: if the password stored in the snapshot does not match the current AD stored password then the machine's trust relationship is broken, and the machine has "dropped off the domain".

There is an easy way to prevent this: Group Policy settings are available that can be used to adjust the frequency of the password change, or to disable it completely. You can configure the setting either in the local Group Policy of the machine or in a Domain GPO that is applied to this machine. It is found under Computer Configuration > Windows Settings > Security Settings > Local Policies > Security Options:

Domain member: Disable machine account password changes

Change this setting to Enabled and the machine will never again change its password.
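
On a test machine that is not managed through a GPO, the same setting can presumably be written straight to the registry. The following is a minimal sketch, assuming that this security option is backed by the Netlogon parameter DisablePasswordChange and that no Domain GPO enforces a different value (a GPO would overwrite it at the next policy refresh). Run it from an elevated prompt before you take the snapshot.

```powershell
# Sketch: disable the automatic machine account password change locally.
# Assumptions: elevated session, and no Domain GPO enforces a different value.
$netlogonKey = 'HKLM:\SYSTEM\CurrentControlSet\Services\Netlogon\Parameters'

# 1 = never change the machine password, 0 = default behavior
Set-ItemProperty -Path $netlogonKey -Name 'DisablePasswordChange' -Value 1 -Type DWord

# Show the resulting value together with the configured maximum password age (in days)
Get-ItemProperty -Path $netlogonKey |
    Select-Object DisablePasswordChange, MaximumPasswordAge
```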

However, for security reasons Microsoft does not recommend this, and there is a high probability that your AD administrators have ruled it out by enforcing the value Disabled for this policy setting in the Default Domain Policy or another GPO that takes precedence. What can you do then?

If we cannot prevent the machine's trust relationship from breaking, then we need to find a way to fix it automatically whenever the machine is booted from a snapshot. There are at least two ways to do this. The most commonly known and used tool for resetting the computer account's password is Microsoft's netdom.exe. The version for Windows Server 2003 is included in the Support Tools package; for newer Windows versions it is part of the Remote Server Administration Tools (RSAT). Usage descriptions can be found e.g. in this TechNet article. Microsoft resources usually tell you that netdom resetpwd only works for domain controllers, but this is not true: it also works fine on domain member machines.
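
For reference, a netdom call wrapped in PowerShell might look like the sketch below. The function name, the -ShowOnly switch, and the server and account names are made up for illustration; it assumes that netdom.exe is installed and on the PATH and that the session is elevated.

```powershell
# Sketch: reset the machine account password with netdom.exe.
# Assumptions: netdom.exe is on the PATH and the session is elevated.
function Reset-MachinePasswordWithNetdom {
    param(
        [Parameter(Mandatory)] [string] $DomainController,
        [Parameter(Mandatory)] [string] $DomainUser,
        [switch] $ShowOnly   # return the netdom arguments instead of running them
    )

    # /PasswordD:* makes netdom prompt for the password instead of
    # reading it from the command line
    $netdomArgs = @('resetpwd', "/Server:$DomainController", "/UserD:$DomainUser", '/PasswordD:*')

    if ($ShowOnly) { return $netdomArgs }

    & netdom.exe @netdomArgs
    # A non-zero exit code is treated as a failure here
    if ($LASTEXITCODE -ne 0) {
        Write-Warning "netdom resetpwd exited with code $LASTEXITCODE"
    }
}

# Placeholder names; drop -ShowOnly to actually run netdom
Reset-MachinePasswordWithNetdom -DomainController 'dc01.example.com' -DomainUser 'EXAMPLE\service_account' -ShowOnly
```

For unattended use the password would have to be supplied in place of the asterisk, which puts it on the command line in clear text.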

However, I recently discovered a second way to reset the machine's password, and I definitely prefer this method, because it is based on PowerShell and also provides an easy way to check whether the trust relationship is really broken and needs to be fixed. Please warmly welcome the PowerShell cmdlet Test-ComputerSecureChannel!

Called without any parameters, this cmdlet returns True if the machine's secure channel is okay and False if its trust relationship is broken. If it is broken, you can use the parameter -Repair to fix it. This requires an account that has local administrative permissions and permission in AD to change the machine's password. The credentials of this account can be passed as a PSCredential object with the parameter -Credential. Here is an example of a PowerShell script that resets a machine's password, but only if necessary:
# Only attempt a repair if the secure channel to the domain is actually broken
if (-not (Test-ComputerSecureChannel)) {
    Write-Host "Repairing Domain Trust Relationship ..."
    # Account with local admin rights and permission to reset the machine password in AD
    $UserName = "domain\service_account"
    $PlainTextPassword = "secret"
    $SecurePassword = $PlainTextPassword | ConvertTo-SecureString -AsPlainText -Force
    $Credentials = New-Object System.Management.Automation.PSCredential -ArgumentList $UserName, $SecurePassword
    Test-ComputerSecureChannel -Repair -Credential $Credentials
}
For the credentials you will typically use a service account with a non-expiring password. If you have automated the build of your computers, you are probably already using such an account for joining them to the domain, and it is perfectly okay to use the same account here.
Please note: The Test-ComputerSecureChannel cmdlet has been available since PowerShell version 2.0, but the -Credential parameter was only added in version 3.0, so make sure that you have at least this version installed on your machine.
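
A quick guard at the top of the script can make that requirement explicit. This is a minimal sketch that relies only on the automatic variable $PSVersionTable; the variable name $supportsCredential is made up for illustration.

```powershell
# $PSVersionTable is an automatic variable that exists in PowerShell 2.0 and later
$supportsCredential = $PSVersionTable.PSVersion.Major -ge 3

if (-not $supportsCredential) {
    Write-Warning "PowerShell 3.0 or later is required for Test-ComputerSecureChannel -Credential."
}
```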

Those who are worried about the script containing the password in clear text can instead store it in encrypted form in a file and load and decrypt it from there at runtime. There are numerous code samples available for doing this, e.g. here by Hal Rottenberg.
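
A minimal sketch of this approach, using only the built-in cmdlets ConvertFrom-SecureString and ConvertTo-SecureString (it is not necessarily what the linked sample does). The function names and the file path are made up for illustration. Without a -Key parameter, Windows protects the string with the Data Protection API, so the file can only be decrypted by the same user account on the same machine. Create it under the account that will later run the repair script, and do so before you take the snapshot.

```powershell
# One-time setup: store the service account password in encrypted form.
function Save-EncryptedPassword {
    param(
        [Parameter(Mandatory)] [System.Security.SecureString] $Password,
        [Parameter(Mandatory)] [string] $Path
    )
    $Password | ConvertFrom-SecureString | Set-Content -Path $Path
}

# At runtime: rebuild a PSCredential object from the encrypted file.
function Import-ServiceCredential {
    param(
        [Parameter(Mandatory)] [string] $UserName,
        [Parameter(Mandatory)] [string] $Path
    )
    $securePassword = Get-Content -Path $Path | ConvertTo-SecureString
    New-Object System.Management.Automation.PSCredential -ArgumentList $UserName, $securePassword
}

# Setup (interactive, run once; the path is a placeholder):
#   Save-EncryptedPassword -Password (Read-Host 'Password' -AsSecureString) -Path 'C:\Scripts\service_account.pwd'
# In the repair script:
#   $Credentials = Import-ServiceCredential -UserName 'domain\service_account' -Path 'C:\Scripts\service_account.pwd'
```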

One reliable way to execute the script from a command prompt or shortcut is using the following command line:
%SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe -ExecutionPolicy Unrestricted -File Repair-ComputerSecureChannel.ps1
The last challenge now is to make this script execute at each boot of your snapshotted machines. Again there are multiple ways to achieve this:
  • Force AutoLogon of a local(!) administrative user that has the script linked in its Startup folder
  • Create a scheduled task that runs at boot time and executes the script
  • Use srvany (from the Windows Server 2003 Resource Kit Tools) to install a service that starts automatically at boot time and executes the script. With newer Windows versions this is unsupported, but it still works.
One method that will not work is defining a startup script in the Local Group Policy! If the machine's trust relationship is broken, it will fail to apply any Domain GPOs and stop the GPO application process completely, so even local policies will not be applied.
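
The scheduled task variant can be set up with schtasks.exe, for example. The following is a sketch only: the function name, the -ShowOnly switch, the task name, and the script path are made up for illustration, and it assumes an elevated session and a script path without spaces.

```powershell
# Sketch: register a boot-time task that runs the repair script as SYSTEM.
# Assumptions: elevated session; the script path must not contain spaces,
# because this sketch does not add nested quoting.
function Register-RepairTask {
    param(
        [Parameter(Mandatory)] [string] $ScriptPath,
        [string] $TaskName = 'Repair-ComputerSecureChannel',
        [switch] $ShowOnly   # return the schtasks arguments instead of running them
    )

    $powershellExe = "$env:SystemRoot\System32\WindowsPowerShell\v1.0\powershell.exe"
    $command = "$powershellExe -ExecutionPolicy Unrestricted -File $ScriptPath"

    # /SC ONSTART = run at system startup, /RU SYSTEM = run as Local System,
    # /F = overwrite an existing task with the same name
    $taskArgs = @('/Create', '/TN', $TaskName, '/TR', $command, '/SC', 'ONSTART', '/RU', 'SYSTEM', '/F')

    if ($ShowOnly) { return $taskArgs }
    & schtasks.exe @taskArgs
}

# Placeholder path; drop -ShowOnly to actually create the task
Register-RepairTask -ScriptPath 'C:\Scripts\Repair-ComputerSecureChannel.ps1' -ShowOnly
```

Depending on the environment, the network may not be available yet when the task starts, so it can be worth testing this on a freshly reverted machine.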

Do you know other reliable methods for executing scripts at boot time in Windows machines? Or do you have any other remarks or questions? Then please comment!


This post first appeared on the VMware Front Experience Blog and was written by Andreas Peetz. Follow him on Twitter to keep up to date with what he posts.



6 comments:

  1. YOU ROCK. I was trying to find the secret bits to stop this from happening but hit the brick wall of Microsoft documentation.

  2. Amazing post. Can't wait to get this going *marks it on project list*

  3. I've been fixing this by removing the virtual PC from the domain, restarting, then adding it back to the domain.

    Replies
    1. Sure, that's the long way to fix it. Takes two reboots.

    2. A quicker trick is to remove the PC from the domain, then re-add it to the domain *before* you restart.
