User Data on EC2 does not execute domain join script on Windows


I'm using an Auto Scaling Group to create Windows instances on Amazon EC2. In User Data, there is a script to add the machine to the domain, but the instance does not join the domain.

I have already validated that User Data is running, as there is a previous step in the script that installs PuTTY, and this is successful. However, the part of the script that is supposed to add the machine to Active Directory doesn't work.

To ensure that the instance was able to resolve the domain, I added a step in the script to configure the domain's DNS in the Network Adapter. Even so, the machine doesn't join the domain and there are no errors in the logs.

What I've tried: I checked the logs in C:\ProgramData\Amazon\EC2-Windows\Launch\Log\UserdataExecution.log and found no error messages. I also checked the files in C:\Windows\system32\config\systemprofile\AppData\Local\Temp\Amazon\EC2-Windows\Launch\InvokeUserData\

I've confirmed that User Data is running correctly (the PuTTY installation proves this).

I added a step to configure the domain's DNS in the Network Adapter before attempting the domain join.

I've checked the permissions of the account used to join the domain.

I've tested running the command manually inside the instance, and it works.

I've also followed the recommendations in another post and created a custom image with Sysprep, but the problem still persists.

  • The User Data script might be running early in the boot process, before the network stack or DNS client is ready. Adding a wait for the network, such as Start-Sleep -Seconds 30, might help if the problem is caused by User Data running too early.
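A fixed sleep is a guess at how long the network needs. One possible alternative is to poll until the domain name resolves. The sketch below is only an illustration: the helper name `Wait-ForCondition` and its parameters are hypothetical and do not come from the original script.

```powershell
# Hypothetical helper: poll a condition until it returns a truthy value
# or the timeout expires. Returns $true on success, $false on timeout.
function Wait-ForCondition {
    param(
        [Parameter(Mandatory)] [scriptblock] $Condition,
        [int] $TimeoutSeconds = 300,
        [int] $IntervalSeconds = 5
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ($true) {
        try {
            if (& $Condition) { return $true }
        } catch {
            # An exception from the condition is treated as "not ready yet".
        }
        if ((Get-Date) -ge $deadline) { return $false }
        Start-Sleep -Seconds $IntervalSeconds
    }
}
```

In a User Data script it could be called as `Wait-ForCondition -Condition { Resolve-DnsName -Name 'corp.example.com' -ErrorAction Stop }`, where `corp.example.com` is a placeholder for the real domain. A `$false` result means the name was still not resolvable when the timeout expired, which points at DNS configuration rather than timing.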

4 Answers

Hello.

It may not be a direct solution, but how about joining AD using SSM documents as mentioned in the following blog?
I've used this method to set up automatic AD join in the past, and it worked fine.
I think it will probably work on the latest Windows Server OS as well.
https://aws.amazon.com/jp/blogs/security/how-to-configure-your-ec2-instances-to-automatically-join-a-microsoft-active-directory-domain/

answered 2 months ago

It seems that you're encountering a common issue with joining EC2 instances to a domain using User Data scripts. While your User Data script is executing partially (as evidenced by the successful PuTTY installation), the domain join portion is failing silently. This could be due to several reasons:

  1. Timing issues: The domain join script might be executing before the network is fully initialized or before DNS settings are properly configured.

  2. Connectivity problems: There might be network connectivity issues between your EC2 instance and the domain controller.

  3. Security group or network ACL settings: Ensure that your security groups and network ACLs allow the necessary traffic for domain join operations.

  4. Incorrect credentials or permissions: The account used for domain join might not have the correct permissions.

To troubleshoot and resolve this issue, consider the following steps:

  1. Add a delay in your User Data script before attempting the domain join. This can help ensure that the network is fully initialized.

  2. Use the AWS Systems Manager Run Command instead of User Data for domain join operations. This method is more reliable and provides better logging and error reporting.

  3. Verify that your EC2 instance can resolve the domain name and communicate with the domain controller. You can use tools like nslookup and telnet to test connectivity.

  4. Double-check your security group and network ACL settings to ensure they allow traffic on the necessary ports for Active Directory communication (e.g., TCP/UDP 53, 88, 389, 445).

  5. Verify that the DNS settings on the EC2 instance are correct and pointing to the domain controller's IP addresses.

  6. Use the DirectoryServicePortTest tool to verify connectivity to the domain controller on the required ports.

  7. Consider using AWS Directory Service's seamless domain join feature, which can simplify the process and provide better reliability.
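The name-resolution and port checks from steps 3 and 4 can be sketched in PowerShell, run on the instance itself. The domain name and domain controller address below are placeholders, not values from the question. `Test-NetConnection` only tests TCP, so UDP reachability is not covered by this sketch.

```powershell
# Placeholder values; replace with the real domain and a domain controller address.
$domain = 'corp.example.com'
$domainController = '10.0.0.10'

# Can the instance resolve the domain name with its current DNS settings?
try {
    Resolve-DnsName -Name $domain -ErrorAction Stop | Out-Null
    Write-Output "DNS: $domain resolves"
} catch {
    Write-Output "DNS: $domain does not resolve: $($_.Exception.Message)"
}

# Are the TCP ports listed above reachable on the domain controller?
foreach ($port in 53, 88, 389, 445) {
    $result = Test-NetConnection -ComputerName $domainController -Port $port -WarningAction SilentlyContinue
    Write-Output "TCP $port reachable: $($result.TcpTestSucceeded)"
}
```

If the DNS check fails while the port checks succeed, the instance is probably using the wrong DNS servers rather than being blocked by a security group or network ACL.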

If the issue persists after trying these steps, you may need to implement more detailed logging in your User Data script to capture any errors that might be occurring during the domain join process.
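A minimal sketch of such logging, meant as the body of a User Data script, is shown below. The log path, domain name, account name, and password are all placeholders chosen for illustration. Hard-coding credentials in User Data is not recommended in practice and is done here only to keep the example self-contained.

```powershell
# Record everything this script prints, including errors, to a log file.
# The log path is an assumption; any writable location works.
Start-Transcript -Path 'C:\Windows\Temp\domain-join.log' -Append

try {
    # Placeholder credentials for illustration only.
    $password = ConvertTo-SecureString 'REPLACE_ME' -AsPlainText -Force
    $credential = New-Object System.Management.Automation.PSCredential('CORP\join-account', $password)

    # -ErrorAction Stop turns a failed join into an exception that the catch block can log.
    Add-Computer -DomainName 'corp.example.com' -Credential $credential -ErrorAction Stop

    Write-Output 'Domain join succeeded; restarting.'
    Stop-Transcript
    Restart-Computer -Force
} catch {
    Write-Output "Domain join failed: $($_.Exception.Message)"
    Stop-Transcript
}
```

With `-ErrorAction Stop`, a join failure that would otherwise pass unnoticed should end up in the transcript file with its error message.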

answered 2 months ago

Accepted Answer

I was able to solve the issue thanks to @Riku_Kobayashi's suggestion to use AWS Systems Manager. It really helped me get to the root cause.

When I tried using SSM, I encountered the following error: “The specified domain either does not exist or could not be contacted.” This made it clear that the instance was unable to reach the domain's DNS server.

Since I was launching the instance using an Auto Scaling Group module, I updated my Terraform code to configure the correct DNS servers during instance creation. That ensured the machine could communicate with the Active Directory domain controller.

After setting up the DNS properly, I added a simple script to join the domain, and it worked as expected. With the DNS configuration in place at launch time, the instance was able to connect to the AD server and complete the domain join successfully.

resource "aws_vpc_dhcp_options" "ad_dns" {
  domain_name         = var.ds_managed_ad_directory_name
  domain_name_servers = var.directory_dns

  tags = merge(
    local.tags,
    {
      Name = "dns-${var.project}-${var.scope}-${var.environment}"
    }
  )
}

resource "aws_vpc_dhcp_options_association" "dns_assoc" {
  vpc_id          = data.aws_ssm_parameter.directory_services_network_id.value
  dhcp_options_id = aws_vpc_dhcp_options.ad_dns.id
}
answered a month ago


Before you test the User Data, can you launch a Windows EC2 instance, run the User Data script manually from within the instance, and see if that works? That way, you can confirm whether the script itself works as expected.

In User Data, make sure you restart the computer so that the changes take effect. Here's an example.

<powershell>
C:\windows\userdata-scripts\JoinDomain.ps1
Restart-Computer
</powershell>
answered 2 months ago
  • Yes, it works. I added a step to rename the machine, expecting the script to continue after the restart, but it does not continue. When I run the same script inside the machine, it works. It needs to work during the creation of the instance, and I don't understand why it doesn't.
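One likely reason the script does not continue is that, by default, the EC2 Windows launch agent runs User Data only on the first boot. Adding `<persist>true</persist>` after the script tags makes it run on every boot, so the script then has to check what has already been done. The sketch below assumes placeholder domain and credential values and is not the original poster's script.

```powershell
<powershell>
# Runs on every boot because of the persist tag below, so skip the join
# once the machine is already a domain member.
$system = Get-CimInstance -ClassName Win32_ComputerSystem
if (-not $system.PartOfDomain) {
    # Placeholder credentials for illustration only.
    $password = ConvertTo-SecureString 'REPLACE_ME' -AsPlainText -Force
    $credential = New-Object System.Management.Automation.PSCredential('CORP\join-account', $password)

    # -Restart reboots only after a successful join, which avoids a reboot loop on failure.
    Add-Computer -DomainName 'corp.example.com' -Credential $credential -Restart -ErrorAction Stop
}
</powershell>
<persist>true</persist>
```

`Add-Computer` also accepts a `-NewName` parameter, so the rename and the join can happen in a single restart instead of two.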
