IoT Core: KafkaAction failed to send a message to the specified bootstrap servers. SSL handshake failed

1

I'm trying to create an IoT Core rule that triggers a Kafka action to send our device data directly to our Kafka cluster, using Basic Ingest. I have successfully set up the VPC destination that this action needs, specifying the subnets that contain our Kafka brokers. However, when the KafkaAction fails I get the following log in CloudWatch:

{
    "ruleName": "testKafkaAction",
    "topic": "",
    "cloudwatchTraceId": "<id>",
    "clientId": "test",
    "base64OriginalPayload": "<payload>",
    "failures": [
        {
            "failedAction": "KafkaAction",
            "failedResource": "iot-core-sensordata-stream",
            "errorMessage": "KafkaAction failed to send a message to the specified bootstrap servers. SSL handshake failed. Message arrived on: , Action: kafka, topic: test-kafka-action, bootstrap.servers: <bootstrap_servers>"
        }
    ]
}

Our Kafka cluster is self-hosted and uses a self-signed CA. I have created a binary secret in Secrets Manager containing the truststore in PKCS12 format. The truststore contains our self-signed CA. I am using SASL_SSL as the security protocol for the rule, and SCRAM-SHA-512 as the mechanism. The username and password are stored as a separate secret in Secrets Manager. My rule's IAM policy is set up correctly to access these secrets. One other thing to note: the bootstrap servers list for the rule consists of the private ip:port pairs of the nodes running the Kafka brokers in our VPC, and I have ensured these IPs are on the SAN list for the cert. Here is the JSON template for my rule as a reference:

{
    "sql": "SELECT *",
    "ruleDisabled": false,
    "awsIotSqlVersion": "2016-03-23",
    "actions": [
        {
            "kafka": {
                "destinationArn": "<VPC_DESTINATION_ARN>",
                "topic": "test-kafka-action",
                "clientProperties": {
                    "bootstrap.servers": "<KAFKA_BOOTSTRAP_SERVERS>",
                    "key.serializer": "org.apache.kafka.common.serialization.StringSerializer",
                    "value.serializer": "org.apache.kafka.common.serialization.ByteBufferSerializer",
                    "security.protocol": "SASL_SSL",
                    "ssl.truststore": "${get_secret('<SECRET_NAME>', 'SecretBinary', '<KAFKA_RULE_ROLE_ARN>')}",
                    "ssl.truststore.password": "{{SSL_TRUSTSTORE_PASSWORD}}",
                    "sasl.mechanism": "SCRAM-SHA-512",
                    "sasl.scram.username": "${get_secret('<SECRET_NAME>', 'SecretString', 'kafkaUser', '<KAFKA_RULE_ROLE_ARN>')}",
                    "sasl.scram.password": "${get_secret('<SECRET_NAME>', 'SecretString', 'kafkaPassword', '<KAFKA_RULE_ROLE_ARN>')}"
                }
            }
        }
    ],
    "errorAction": {
        "cloudwatchLogs": {
            "logGroupName": "AWSIotLogsV2",
            "roleArn": "<KAFKA_RULE_LOGS_ROLE_ARN>"
        }
    }
}
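
For reference, a binary secret like the truststore mentioned above can be created through an SDK. The following is a minimal sketch in Python, assuming `client` is a boto3 Secrets Manager client (for example `boto3.client("secretsmanager")`). The function name `store_binary_secret` and the file path are hypothetical and only serve as illustration.

```python
from pathlib import Path


def store_binary_secret(client, name: str, store_path: str) -> str:
    """Upload a truststore or keystore file as a binary secret and return its ARN.

    `client` is assumed to be a boto3 Secrets Manager client; it is passed in
    so that this sketch does not depend on how credentials are configured.
    """
    # Read the raw PKCS12 (or JKS) bytes; boto3 handles the wire encoding.
    data = Path(store_path).read_bytes()
    response = client.create_secret(Name=name, SecretBinary=data)
    return response["ARN"]
```

It could be called as `store_binary_secret(boto3.client("secretsmanager"), "<SECRET_NAME>", "truststore.p12")`, after which the secret name can be used in the `get_secret` template of the rule.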

This error message does not give any reason why the handshake failed, so my only guess is that IoT Core does not allow self-signed CAs for the KafkaAction in a rule. Is this true, or am I missing something elsewhere? I am able to connect and publish a message to our Kafka cluster with no issues using a Python client, giving it the same credentials and self-signed CA, after adding our servers' IP addresses to the SAN list on the cert.
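
Since the rule's error message hides the handshake details, one way to see them is to repeat the TLS handshake from a host inside the VPC. The following is a rough sketch using only the Python standard library. It assumes the CA certificate is available as a PEM file (Python's `ssl` module does not read PKCS12 truststores directly), and the function names are made up for illustration. It does not present a client certificate and does not perform the SASL step, so it only checks that the broker certificate validates against the CA for the given address.

```python
import socket
import ssl
from typing import Optional


def make_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a client-side TLS context that trusts the given CA bundle (PEM)."""
    ctx = ssl.create_default_context(cafile=ca_file)
    # create_default_context already enables both checks; set them
    # explicitly so the intent is visible.
    ctx.check_hostname = True
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def check_handshake(host: str, port: int, ca_file: str,
                    timeout: float = 5.0) -> Optional[str]:
    """Attempt a TLS handshake and return the negotiated protocol version.

    Raises ssl.SSLError (for example ssl.SSLCertVerificationError) or
    OSError if the connection or the handshake fails.
    """
    ctx = make_context(ca_file)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # server_hostname is matched against the certificate's SAN list,
        # so pass the same private IP used in bootstrap.servers.
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

Calling `check_handshake("<broker_private_ip>", <port>, "ca.pem")` raises an exception whose message names the failing check, which is more than the rule's CloudWatch log provides.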

Otto45
asked 3 years ago · 1018 views
3 Answers
2
Accepted Answer

I ended up getting things working with a slightly different configuration, so I am answering my own question in case it helps anyone else.

I am still using a self-signed CA for our Kafka cluster, and I provide it to the KafkaAction in my IoT Core rule via a PKCS12 truststore stored in Secrets Manager as a binary secret (I have confirmed that the older JKS format works for the truststore as well). What made it work was switching the security protocol to SSL and generating an X.509 certificate and private key, signed by the self-signed CA, for the KafkaAction to use. I put this certificate and private key into a keystore, also in PKCS12 format, and likewise stored it in Secrets Manager as a binary secret. For those unfamiliar: binary secrets can only be created via the CLI or SDKs, not through the console.

In my rule's configuration, I used the get_secret function as shown in the docs to have IoT Core fetch the binaries from Secrets Manager. The rule's IAM role also has a policy allowing access to these secrets. One last thing to note: my truststore and keystore are both password protected, so I also had to specify the password (the same for both in my case) in the rule configuration.

I was never able to get things working with SASL and SCRAM-SHA-512, so my only guess is that this is unsupported with self-signed CAs, or that the documentation is unclear about which regions it is actually supported in.
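
To make the working configuration concrete, here is a hedged sketch in Python that assembles the `clientProperties` for the SSL variant. The serializers and the `get_secret` template follow the rule in the question. The property names `ssl.keystore` and `ssl.keystore.password` are an assumption based on the Kafka rule action documentation, since this answer does not spell them out, and the helper names and placeholder values are hypothetical.

```python
import json


def binary_secret_ref(secret_name: str, role_arn: str) -> str:
    """Return a get_secret substitution template for a binary secret."""
    return "${get_secret('%s', 'SecretBinary', '%s')}" % (secret_name, role_arn)


def ssl_client_properties(bootstrap_servers: str, truststore_secret: str,
                          keystore_secret: str, store_password: str,
                          role_arn: str) -> dict:
    """Build Kafka action clientProperties for SSL with a client keystore."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "key.serializer": "org.apache.kafka.common.serialization.StringSerializer",
        "value.serializer": "org.apache.kafka.common.serialization.ByteBufferSerializer",
        "security.protocol": "SSL",  # switched from SASL_SSL
        "ssl.truststore": binary_secret_ref(truststore_secret, role_arn),
        "ssl.truststore.password": store_password,
        # Assumed property names for the client certificate keystore.
        "ssl.keystore": binary_secret_ref(keystore_secret, role_arn),
        "ssl.keystore.password": store_password,
    }


# Placeholder values only; substitute real names before use.
props = ssl_client_properties(
    "<KAFKA_BOOTSTRAP_SERVERS>", "<TRUSTSTORE_SECRET_NAME>",
    "<KEYSTORE_SECRET_NAME>", "<STORE_PASSWORD>", "<KAFKA_RULE_ROLE_ARN>",
)
print(json.dumps(props, indent=4))
```

The printed object can replace the `clientProperties` block of the rule in the question. Note that no `sasl.*` properties are present, because the client certificate takes over authentication.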

Otto45
answered 3 years ago
EXPERT
reviewed 3 months ago
0

Hi, I am facing the same problem. In addition to what is shared here, I also tried the SSL protocol with a key pair signed by the CA in my cluster. I also created a CA myself and generated key pairs for the brokers and IoT Core. All of these attempts ended with the same error. @Otto45, did you solve the issue? If so, could you please share the solution here? Thanks!

acsgn
answered 3 years ago
  • I got things working, and I had to switch to SSL from SASL_SSL. I posted a detailed answer, hope it helps!

-1

Hello,

Thank you for reaching out to us. I understand that you are getting an "SSL handshake failed" error while trying to create an IoT Core rule that triggers a Kafka action to send your device data directly to your Kafka cluster, using Basic Ingest.

Could you please confirm which region you are using? I ask because you mentioned using SASL_SSL as the security protocol for the rule with SCRAM-SHA-512 as the mechanism, and SCRAM-SHA-512 is the only supported security mechanism in the cn-north-1, cn-northwest-1, us-gov-east-1, and us-gov-west-1 Regions. Please refer to link [1] for more information on the Regions in which this security mechanism is supported.

References:

  1. https://docs.aws.amazon.com/iot/latest/developerguide/apache-kafka-rule-action.html#apache-kafka-rule-action-parameters
AWS
SUPPORT ENGINEER
answered 3 years ago
  • I am using us-east-2, and that link says you have to use SCRAM-SHA-512 for SASL_SSL in those specific regions. The way I understand it, I should be fine using SCRAM-SHA-512 in any region.
