I want to troubleshoot a failed AWS Glue job that connects to an Amazon Managed Streaming for Apache Kafka (MSK) cluster across AWS accounts.
Resolution
Check that the AWS Glue job can connect to the Amazon MSK cluster, and then troubleshoot the AWS Glue job's authentication method.
Check the cross-account AWS Glue job's connectivity
To verify that the AWS Glue job can connect to the Amazon MSK cluster, complete the following steps:
- Verify that the network access control list (network ACL) for the AWS Glue connection's subnet allows traffic to and from the Amazon MSK cluster in the cross-account Amazon Virtual Private Cloud (Amazon VPC).
- Confirm that the Amazon MSK cluster's security group allows the AWS Glue connection's subnet CIDR on the Amazon MSK cluster's bootstrap server ports.
Note: The AWS Glue connection's security groups must contain a self-referencing inbound rule for the necessary TCP ports.
- Check that the VPC peering connection between the Amazon MSK cluster's VPC and the AWS Glue connection's VPC is correctly configured.
- Use Reachability Analyzer to check whether a component interferes with connectivity between the VPCs.
- Launch an Amazon Elastic Compute Cloud (Amazon EC2) instance in the same subnet and security group that the AWS Glue connection uses.
Use Session Manager, a capability of AWS Systems Manager, or an SSH client to log in to your EC2 instance. Then, run the following tests:
telnet example-bootstrap-server-hostname example-bootstrap-server-port
nc -zv example-bootstrap-server-hostname example-bootstrap-server-port
dig example-bootstrap-server-hostname
Note: In the preceding commands, replace the example values with your values. If telnet isn't installed, then run sudo yum install telnet -y to install it.
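If the telnet or nc output includes "Connected" (or "succeeded", depending on the nc version), then the AWS Glue connection's subnet can reach the bootstrap server on that port. Also check that the dig output resolves the bootstrap server hostname to an IP address.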
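A bootstrap string usually lists several brokers, so it can help to repeat the nc test for each one. The following is a minimal sketch that assumes nc is installed on the EC2 instance and supports the -z and -w options. The function names split_brokers and check_brokers are illustrative and aren't part of any AWS tool.

```shell
# Sketch: test TCP reachability for every broker in a comma-separated
# bootstrap string. Function names are illustrative only.

# Print one "host port" pair per line from "host1:port1,host2:port2".
# The function body runs in a subshell so the IFS change doesn't leak.
split_brokers() (
  IFS=','
  for broker in $1; do
    printf '%s %s\n' "${broker%:*}" "${broker##*:}"
  done
)

# Try a TCP connection to each broker with a 5-second timeout.
check_brokers() {
  split_brokers "$1" | while read -r host port; do
    if nc -z -w 5 "$host" "$port" </dev/null >/dev/null 2>&1; then
      echo "OK   $host:$port"
    else
      echo "FAIL $host:$port"
    fi
  done
}

# Usage (replace the example values with your bootstrap string):
# check_brokers "example-broker-1:9094,example-broker-2:9094"
```

A FAIL line for only some brokers can point to a subnet-specific route or security group rule rather than a problem with the whole cluster.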
Troubleshoot authentication issues
To verify the bootstrap server URL in the AWS Glue connection, complete the following steps:
- Get the bootstrap brokers from Amazon MSK.
- Open the AWS Glue console.
- In the navigation pane, under Data Catalog, choose Connections. You can also choose Data connections in the navigation pane.
- In Connections, select your connection, and then choose Actions.
- In the dropdown list, choose Edit.
- Under Connection access, check that the Kafka bootstrap server URLs match the URLs in the Amazon MSK console.
- If the URLs don't match, then update them according to the authentication method that the Amazon MSK cluster uses. Use the port numbers that match your broker configuration:
For TLS/SSL, use port 9094 for access within AWS and port 9194 for public access.
For SASL/SCRAM, use port 9096 for access within AWS and port 9196 for public access.
- Choose Save changes.
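The comparison in the preceding steps can also be done from the command line. The following sketch assumes that the AWS CLI is configured with one profile for each account. The profile names, the CLUSTER_ARN and CONNECTION_NAME variables, and the helper functions are hypothetical. It also assumes that the connection stores its brokers in the KAFKA_BOOTSTRAP_SERVERS connection property.

```shell
# Sketch: compare two comma-separated broker lists regardless of order.
# Helper names are illustrative only.

# Remove spaces, sort the brokers, and join them back with commas.
normalize_brokers() {
  printf '%s' "$1" | tr ',' '\n' | tr -d ' ' | sort | paste -s -d ',' -
}

# Succeed only when both lists contain the same host:port entries.
same_brokers() {
  [ "$(normalize_brokers "$1")" = "$(normalize_brokers "$2")" ]
}

# Usage (assumed profile and variable names; choose the --query field
# that matches your authentication method, for example
# BootstrapBrokerStringTls or BootstrapBrokerStringSaslScram):
# msk=$(aws kafka get-bootstrap-brokers --cluster-arn "$CLUSTER_ARN" \
#   --query BootstrapBrokerStringTls --output text --profile msk-account)
# glue=$(aws glue get-connection --name "$CONNECTION_NAME" \
#   --query 'Connection.ConnectionProperties.KAFKA_BOOTSTRAP_SERVERS' \
#   --output text --profile glue-account)
# same_brokers "$msk" "$glue" && echo "match" || echo "mismatch"
```

Because the ports differ by authentication method, a mismatch that shows only in the port number often means that the connection was created for a different authentication method than the cluster uses.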
Then, take the following actions based on the Amazon MSK cluster's authentication method.
SASL/SCRAM-SHA-512
Verify that the username and password are correct. If you store your credentials in AWS Secrets Manager, then check the values in the secret and verify that the AWS Glue connection's subnet can reach the Secrets Manager endpoint.
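Both checks can be scripted from the test EC2 instance. The following sketch assumes that the secret stores the credentials as JSON with username and password keys, and that you use the standard Regional endpoint name in the aws partition. The helper names and the example secret ID are hypothetical.

```shell
# Sketch: basic checks for a SASL/SCRAM secret. Helper names are
# illustrative only.

# Succeed when the secret string contains both expected JSON keys.
# This is a plain text match, not a full JSON parse.
has_scram_keys() {
  printf '%s' "$1" | grep -q '"username"' &&
    printf '%s' "$1" | grep -q '"password"'
}

# Print the Regional Secrets Manager endpoint hostname for a Region.
secrets_endpoint() {
  printf 'secretsmanager.%s.amazonaws.com\n' "$1"
}

# Usage (replace the example secret ID and Region with your values):
# secret=$(aws secretsmanager get-secret-value \
#   --secret-id example-secret-id --query SecretString --output text)
# has_scram_keys "$secret" && echo "keys present"
# nc -z -w 5 "$(secrets_endpoint us-east-1)" 443 && echo "endpoint reachable"
```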
TLS/SSL client authentication
To validate the Kafka client's keystore certificate and keystore password or key password, run the following command:
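keytool -list -v -keystore /pathtocert/kafka.client.keystore.jks -storepass 123456
Note: In the preceding command, replace /pathtocert with the path to your keystore and 123456 with your keystore password.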
Check that the output contains the AWS Private Certificate Authority (AWS Private CA) certificate that the Amazon MSK cluster uses.
If the output doesn't contain the certificate, then complete steps 5-11 of Set up a client to use authentication to create new keystores.
Important: For every client, create new keystores with a certificate that's issued by the same AWS Private CA that the Amazon MSK cluster uses.
Upload the kafka.client.keystore.jks keystore file to Amazon Simple Storage Service (Amazon S3). Then, configure your AWS Glue Kafka connection with the keystore's S3 path.
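The keytool output can be long, so the following sketch filters it down to the certificate issuers before the upload. It assumes that the password is in an environment variable named KEYSTORE_PASSWORD, and it uses a hypothetical bucket and prefix. The list_issuers helper is illustrative only.

```shell
# Sketch: print the distinct issuers from `keytool -list -v` output.
# Reads the keytool output on standard input.
list_issuers() {
  sed -n 's/^[[:space:]]*Issuer: //p' | sort -u
}

# Usage (replace the example path, bucket, and prefix with your values).
# -storepass:env reads the password from the named environment variable,
# which keeps it out of the command line:
# keytool -list -v -keystore /pathtocert/kafka.client.keystore.jks \
#   -storepass:env KEYSTORE_PASSWORD | list_issuers
# aws s3 cp /pathtocert/kafka.client.keystore.jks \
#   s3://example-bucket/example-prefix/kafka.client.keystore.jks
```

Compare the printed issuer with the subject of the AWS Private CA certificate that the cluster uses.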
IAM authentication
Verify that the AWS Glue job's AWS Identity and Access Management (IAM) role has the correct authorization policy for the Amazon MSK cluster.
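For comparison, the following sketch prints an identity-based policy with the kafka-cluster actions that a consumer typically needs to connect and read from a topic. The function name, role name, policy name, and all argument values are hypothetical, and the exact actions that your job needs might differ. The sketch covers only the role's own policy. Any cross-account trust or cluster policy configuration is outside its scope.

```shell
# Sketch: print an example read-only policy for one MSK topic.
# Arguments: region, account ID, cluster name, cluster UUID, topic, group.
# The body runs in a subshell so the variables don't leak.
msk_read_policy() (
  arn="arn:aws:kafka:$1:$2"
  cluster="$3/$4"
  topic="$5"
  group="$6"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["kafka-cluster:Connect", "kafka-cluster:DescribeCluster"],
      "Resource": "${arn}:cluster/${cluster}"
    },
    {
      "Effect": "Allow",
      "Action": ["kafka-cluster:DescribeTopic", "kafka-cluster:ReadData"],
      "Resource": "${arn}:topic/${cluster}/${topic}"
    },
    {
      "Effect": "Allow",
      "Action": ["kafka-cluster:DescribeGroup", "kafka-cluster:AlterGroup"],
      "Resource": "${arn}:group/${cluster}/${group}"
    }
  ]
}
EOF
)

# Usage (replace every example value with your own):
# aws iam put-role-policy --role-name example-glue-role \
#   --policy-name example-msk-read \
#   --policy-document "$(msk_read_policy us-east-1 111122223333 \
#     example-cluster example-uuid example-topic '*')"
```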
Related information
Creating a Kafka connection
Streaming ETL jobs in AWS Glue