Debugging failed cross-account Transit Gateway VPC Attachments


I am setting up a Transit Gateway for use with a Client VPN to provide access to resources in some of the sub-accounts in my AWS organization. My approach is basically:

Account A (IT Services)

  • Creates and owns the Transit Gateway (TGW) in us-west-2
    • Auto-accept attachments enabled
    • Default association route table
    • Default propagation route table
  • Creates and owns a Client VPN (CVPN) (client CIDR 10.6.0.0/22) in us-west-2
  • Has a 10.5.0.0/16 VPC with a single private subnet 10.5.0.0/22 in us-west-2a (with default ACL) that is used for connection between the TGW and CVPN
  • Uses RAM to share the TGW with Account B

Account B (Stage Infrastructure Network)

  • Has a 10.51.0.0/18 VPC with two private subnets, one in us-west-2a and one in us-west-2b, created specifically for TGW attachment
    • The subnets are 10.51.63.0/28 and 10.51.63.16/28
    • Both subnets use the default ACL
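
The addressing described above can be sanity-checked offline. The following is a minimal sketch using only Python's standard `ipaddress` module; the function names are hypothetical, and the figure of 5 reserved addresses per subnet is the standard AWS VPC reservation. It only gives an upper bound on usable addresses and cannot tell how many are already in use.

```python
import ipaddress

# AWS reserves the first four addresses and the last address of every subnet.
AWS_RESERVED_PER_SUBNET = 5


def usable_addresses(cidr):
    """Upper bound on assignable addresses in an AWS subnet with this CIDR."""
    total = ipaddress.ip_network(cidr).num_addresses
    return max(total - AWS_RESERVED_PER_SUBNET, 0)


def check_layout(vpc_cidr, subnet_cidrs, other_cidrs):
    """Return a list of problems: subnets outside the VPC, or overlapping ranges."""
    vpc = ipaddress.ip_network(vpc_cidr)
    problems = []
    for cidr in subnet_cidrs:
        if not ipaddress.ip_network(cidr).subnet_of(vpc):
            problems.append(f"{cidr} is not inside {vpc_cidr}")
    for cidr in other_cidrs:
        if vpc.overlaps(ipaddress.ip_network(cidr)):
            problems.append(f"{vpc_cidr} overlaps {cidr}")
    return problems


# Values taken from the question: Account B VPC, its two TGW subnets,
# and the Account A VPC and Client VPN CIDRs.
tgw_subnets = ["10.51.63.0/28", "10.51.63.16/28"]
print(check_layout("10.51.0.0/18", tgw_subnets, ["10.5.0.0/16", "10.6.0.0/22"]))
print({cidr: usable_addresses(cidr) for cidr in tgw_subnets})
```

An empty problem list here only rules out addressing mistakes; it says nothing about permissions.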

I am now trying to create a transit gateway VPC attachment in Account B to attach the shared gateway to the two subnets I've created. I'm using Terraform for this, and I have confirmed that the Terraform agent is assuming a role with appropriate permissions to perform this (effectively it has ec2:* allow * in this sub-account). The attachment appears in the console as Pending (not Pending Acceptance, because of auto-accept on the TGW), sits in Pending for about 10 minutes, and then fails. There is no error message in the console, and I've been unable to find any relevant error logs in CloudTrail for either Account A or Account B to explain why the attachment is failing to create.

I would greatly appreciate any input on what I may be doing wrong here with the network setup and/or pointers on things that I should investigate further to identify issues with my setup.

  • Strange that you can't find anything in CloudTrail. Can you search for the API call "CreateTransitGatewayVpcAttachment"? Also, you intend to enable the TGW attachment on two /28 subnets in Account B; can you confirm there are usable IP addresses available within those subnets?

  • @John I can see the CreateTransitGatewayVpcAttachment call in the CloudTrail logs; it has no error code and returns "state": "pending". I can confirm that there are usable IPs available in the subnets that I am trying to attach it to. See my response to George: I am able to create an attachment for the VPC manually, after which the Terraform-created attachments start to become available when re-created.
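
The CloudTrail search discussed in these comments can be scripted. This is a rough sketch, assuming a boto3 CloudTrail client is passed in; `summarize_events` is a hypothetical helper name. It uses the `lookup_events` call with an `EventName` lookup attribute and reads the `errorCode` and `errorMessage` fields, which CloudTrail only includes on failed calls.

```python
import json


def summarize_events(cloudtrail_client, event_name, max_results=50):
    """List recent CloudTrail events with this name, with any error details."""
    response = cloudtrail_client.lookup_events(
        LookupAttributes=[
            {"AttributeKey": "EventName", "AttributeValue": event_name}
        ],
        MaxResults=max_results,
    )
    rows = []
    for event in response.get("Events", []):
        # The full record is delivered as a JSON string.
        record = json.loads(event["CloudTrailEvent"])
        rows.append(
            {
                "time": record.get("eventTime"),
                "error_code": record.get("errorCode"),  # None when the call succeeded
                "error_message": record.get("errorMessage"),
            }
        )
    return rows
```

With boto3 installed and credentials for the account in question, it could be called as `summarize_events(boto3.client("cloudtrail", region_name="us-west-2"), "CreateTransitGatewayVpcAttachment")`.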

JBU
asked 2 months ago, 142 views
1 Answer
Accepted Answer

To add to what John suggested, could you also check whether the role has permission to create the AWSServiceRoleForVPCTransitGateway service-linked role? Amazon VPC uses this service-linked role to perform certain actions when you create a TGW VPC attachment. Ideally the "ec2:*" permissions should be sufficient, but it is worth verifying. One way to do this is to check CloudTrail (in Account B) for the CreateServiceLinkedRole API call to see whether it succeeded, and to check in the same account whether the role exists. Alternatively, you could create the TGW VPC attachment manually in the console; if that succeeds, it suggests the permissions in use with Terraform are not sufficient.
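
The role-existence check can be done programmatically. Below is a minimal sketch, assuming a boto3 IAM client is passed in; the helper name is hypothetical, while the role name comes from the answer above. When searching CloudTrail for CreateServiceLinkedRole, note that IAM is a global service and, as far as I know, its events are recorded in us-east-1, so a search limited to us-west-2 may come back empty.

```python
SLR_NAME = "AWSServiceRoleForVPCTransitGateway"


def service_linked_role_exists(iam_client, role_name=SLR_NAME):
    """Return True if the service-linked role is present in the account."""
    try:
        iam_client.get_role(RoleName=role_name)
    except iam_client.exceptions.NoSuchEntityException:
        # IAM raises this when no role with that name exists.
        return False
    return True
```

Running `service_linked_role_exists(boto3.client("iam"))` in Account B before and after a failed Terraform apply, and again after a manual console attachment, would show which path actually creates the role.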

AWS EXPERT
answered 2 months ago
reviewed by an EXPERT a month ago
  • I confirmed the Terraform role has the appropriate level of permissions, to the extent I can. Oddly, I can't find any events in CloudTrail for CreateServiceLinkedRole; I assume I must be querying that data incorrectly. I followed your advice and tried manually creating the VPC attachment through the console, which succeeded. Immediately after, I re-ran terraform apply on the plan that had failed, and now the attachments it creates are succeeding. I assume the most logical explanation is that the manual console create was enough to create the role, or some other implicit dependency, that Terraform is able to assume but not create itself?

  • Also of note, manually creating an attachment only seems to fix attaching for that one VPC. If I try to attach a different VPC, that one fails as well until I manually create and delete an attachment, at which point future attachments to that VPC start to succeed. I now also see that I've received a 'You recently requested an AWS service that required additional validation' email each time I manually added a VPC attachment, so I wonder if I'm hitting some hidden control on AWS's end that going through the console satisfies but going through the API with Terraform does not.

  • This behavior indicates that the permissions in use with Terraform are not sufficient. I still suspect the CreateServiceLinkedRole call is not succeeding when run through Terraform. When you manually create the attachment, the role is created successfully and remains in the account even after you delete the attachment, which is why later attachments succeed with Terraform. To confirm this, pick a VPC for which you have not yet attempted an attachment and check whether the role exists after a failed attempt with Terraform versus after you create the attachment manually.

  • After digging a bit more, I identified a Service Control Policy that was causing some issues. After modifying that and, to be explicit, modifying the Terraform role to explicitly allow CreateServiceLinkedRole for transitgateway, the attachments are working correctly. Thank you for the help!
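
The thread does not show the exact policy change, so the following is only a sketch of what an explicit allow for the Transit Gateway service-linked role might look like, built as a JSON document in Python. The service principal `transitgateway.amazonaws.com` and the `iam:AWSServiceName` condition key are my assumptions about how the statement would be scoped; the helper name is hypothetical.

```python
import json


def service_linked_role_statement(service_name="transitgateway.amazonaws.com"):
    """IAM statement allowing creation of one service's service-linked role."""
    return {
        "Effect": "Allow",
        "Action": "iam:CreateServiceLinkedRole",
        # Service-linked roles live under the aws-service-role path.
        "Resource": f"arn:aws:iam::*:role/aws-service-role/{service_name}/*",
        # Restrict the permission to this one service (assumed scoping).
        "Condition": {"StringEquals": {"iam:AWSServiceName": service_name}},
    }


policy = {
    "Version": "2012-10-17",
    "Statement": [service_linked_role_statement()],
}
print(json.dumps(policy, indent=2))
```

An allow like this on the Terraform role only helps if no Service Control Policy in the organization denies the same action, which matches what was found above.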
