IoT Core: KafkaAction failed to send a message to the specified bootstrap servers. SSL handshake failed


I'm trying to create an IoT Core rule that triggers a Kafka action to send our device data directly to our Kafka cluster, using Basic Ingest. I have successfully set up the VPC destination that this action needs, specifying the subnets that contain our Kafka brokers. However, I get the following log in CloudWatch when the KafkaAction fails:

{
    "ruleName": "testKafkaAction",
    "topic": "",
    "cloudwatchTraceId": "<id>",
    "clientId": "test",
    "base64OriginalPayload": "<payload>",
    "failures": [
        {
            "failedAction": "KafkaAction",
            "failedResource": "iot-core-sensordata-stream",
            "errorMessage": "KafkaAction failed to send a message to the specified bootstrap servers. SSL handshake failed. Message arrived on: , Action: kafka, topic: test-kafka-action, bootstrap.servers: <bootstrap_servers>"
        }
    ]
}

Our Kafka cluster is self-hosted and uses a self-signed CA. I have created a binary secret in Secrets Manager containing the truststore in PKCS12 format. The truststore contains our self-signed CA. I am using SASL_SSL as the security protocol for the rule, and SCRAM-SHA-512 as the mechanism. The username and password are stored as a separate secret in Secrets Manager. My rule's IAM policy is set up correctly to access these secrets. One other thing to note: the bootstrap servers list for the rule consists of the private ip:port of the nodes running the Kafka brokers in our VPC, and I have ensured these IPs are on the SAN list for the cert. Here is the JSON template for my rule as a reference:

{
    "sql": "SELECT *",
    "ruleDisabled": false,
    "awsIotSqlVersion": "2016-03-23",
    "actions": [
        {
            "kafka": {
                "destinationArn": "<VPC_DESTINATION_ARN>",
                "topic": "test-kafka-action",
                "clientProperties": {
                    "bootstrap.servers": "<KAFKA_BOOTSTRAP_SERVERS>",
                    "key.serializer": "org.apache.kafka.common.serialization.StringSerializer",
                    "value.serializer": "org.apache.kafka.common.serialization.ByteBufferSerializer",
                    "security.protocol": "SASL_SSL",
                    "ssl.truststore": "${get_secret('<SECRET_NAME>', 'SecretBinary', '<KAFKA_RULE_ROLE_ARN>')}",
                    "ssl.truststore.password": "{{SSL_TRUSTSTORE_PASSWORD}}",
                    "sasl.mechanism": "SCRAM-SHA-512",
                    "sasl.scram.username": "${get_secret('<SECRET_NAME>', 'SecretString', 'kafkaUser', '<KAFKA_RULE_ROLE_ARN>')}",
                    "sasl.scram.password": "${get_secret('<SECRET_NAME>', 'SecretString', 'kafkaPassword', '<KAFKA_RULE_ROLE_ARN>')}"
                }
            }
        }
    ],
    "errorAction": {
        "cloudwatchLogs": {
            "logGroupName": "AWSIotLogsV2",
            "roleArn": "<KAFKA_RULE_LOGS_ROLE_ARN>"
        }
    }
}

This error message does not give any reason why the handshake failed, so my only guess is that IoT Core does not allow self-signed CAs for the KafkaAction in a rule. Is this true, or am I missing something elsewhere? I am able to connect and publish a message to our Kafka cluster with no issues using a Python client, giving it the same credentials and self-signed CA, after adding our servers' IP addresses to the SAN list on the cert.

Otto45
asked 3 years ago, 983 views
3 Answers
2
Accepted Answer

I ended up getting things working with a slightly different configuration, so I am answering my own question in case it helps anyone else.

I am still using a self-signed CA for our Kafka cluster, and I provide it to the KafkaAction in my IoT Core rule via a PKCS12-formatted truststore stored in Secrets Manager as a binary secret (I have confirmed that the older JKS format works for the truststore as well). What got things working was generating an X.509 certificate and private key, signed by the self-signed CA, for the KafkaAction to use, and switching the security protocol to SSL. I created a keystore, also in PKCS12 format, containing this certificate and private key, and likewise stored it in Secrets Manager as a binary secret. For those unfamiliar, binary secrets can only be created via the CLI or SDKs; it is not possible through the console.

In my rule's configuration, I used the get_secret function as shown in the docs to instruct IoT Core to fetch the binaries from Secrets Manager. The IAM role for this rule also has a policy allowing access to these secrets. One last thing to note: my truststore and keystore are both password protected, so I also had to specify this password (the same for both in my case) in the rule's configuration.

I was never able to get things working with SASL and SCRAM-SHA-512, so my only guess is that this is unsupported with self-signed CAs, or that the documentation is unclear about which Regions actually support it.
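For anyone wanting to reproduce the SSL setup described in this answer, here is a minimal Python sketch of the two pieces involved: uploading a truststore or keystore file as a binary secret, and building the rule's clientProperties with the protocol switched to SSL. This is an illustration under assumptions, not the exact configuration used above. The helper names (get_secret_ref, build_ssl_client_properties, store_binary_secret) and the secret names in the usage section are hypothetical. The property names ssl.keystore and ssl.keystore.password are taken from the AWS Kafka rule action documentation rather than from the rule template in the question, so check them against the current docs. The client passed to store_binary_secret is assumed to be a boto3 Secrets Manager client.

```python
import json


def get_secret_ref(secret_id, secret_type, role_arn, key=None):
    """Build an IoT Core get_secret substitution template string.

    `key` is only used for SecretString secrets that hold a JSON object.
    """
    args = [secret_id, secret_type] + ([key] if key else []) + [role_arn]
    quoted = ", ".join("'{}'".format(a) for a in args)
    return "${get_secret(" + quoted + ")}"


def build_ssl_client_properties(bootstrap_servers, truststore_secret,
                                keystore_secret, store_password, role_arn):
    """Return clientProperties for a Kafka action using mutual TLS (SSL).

    Assumes the truststore and keystore share one password, as in the answer.
    """
    return {
        "bootstrap.servers": bootstrap_servers,
        "key.serializer": "org.apache.kafka.common.serialization.StringSerializer",
        "value.serializer": "org.apache.kafka.common.serialization.ByteBufferSerializer",
        "security.protocol": "SSL",
        "ssl.truststore": get_secret_ref(truststore_secret, "SecretBinary", role_arn),
        "ssl.truststore.password": store_password,
        # Property names below come from the AWS docs, not the original post.
        "ssl.keystore": get_secret_ref(keystore_secret, "SecretBinary", role_arn),
        "ssl.keystore.password": store_password,
    }


def store_binary_secret(client, name, path):
    """Upload a PKCS12 or JKS file as a binary secret.

    `client` is assumed to be boto3.client("secretsmanager").
    """
    with open(path, "rb") as fh:
        return client.create_secret(Name=name, SecretBinary=fh.read())


if __name__ == "__main__":
    # Placeholder values only; substitute your own secret names and ARNs.
    props = build_ssl_client_properties(
        "<KAFKA_BOOTSTRAP_SERVERS>",
        "<TRUSTSTORE_SECRET_NAME>",
        "<KEYSTORE_SECRET_NAME>",
        "<STORE_PASSWORD>",
        "<KAFKA_RULE_ROLE_ARN>",
    )
    print(json.dumps(props, indent=4))
```

The printed dictionary can be pasted into the "clientProperties" section of the rule template from the question. To upload the stores, something like store_binary_secret(boto3.client("secretsmanager"), "<TRUSTSTORE_SECRET_NAME>", "truststore.p12") should work, assuming boto3 is installed and the caller has permission to create secrets.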

Otto45
answered 3 years ago
EXPERT
reviewed 2 months ago
0

Hi, I am facing the same problem. In addition to what is shared here, I also tried using the SSL protocol and generated a key pair signed by the CA in my cluster. I also created a CA myself and created key pairs for the brokers and IoT Core. But all of these attempts ended with the same error. @Otto45, did you solve the issue? If so, can you please share the solution here? Thanks!

acsgn
answered 3 years ago
  • I got things working, and I had to switch to SSL from SASL_SSL. I posted a detailed answer, hope it helps!

-1

Hello,

Thank you for reaching out to us. I understand that you are getting an "SSL handshake failed" error while trying to create an IoT Core rule that triggers a Kafka action to send your device data directly to your Kafka cluster, using Basic Ingest.

Could you please confirm which Region you are using? I ask because you mentioned that you are using SASL_SSL as the security protocol for the rule and SCRAM-SHA-512 as the mechanism. SCRAM-SHA-512 is the only supported security mechanism in the cn-north-1, cn-northwest-1, us-gov-east-1, and us-gov-west-1 Regions. Please refer to the documentation [1] for more information on the Regions in which this security mechanism is supported.

References:

  1. https://docs.aws.amazon.com/iot/latest/developerguide/apache-kafka-rule-action.html#apache-kafka-rule-action-parameters
AWS
SUPPORT ENGINEER
answered 3 years ago
  • I am using us-east-2, and that link says you have to use SCRAM-SHA-512 for SASL_SSL in those specific Regions. The way I understand it, I should be fine using SCRAM-SHA-512 in any Region.
