IoT Custom Authorizer not respecting refreshAfterInSeconds or disconnectAfterInSeconds in returned policy
I created a GitHub project with scripts that use the AWS CLI to automate reproducing the issue I'm facing.
Here is the link to the project: https://github.com/TCROC/aws-iot-custom-auth
Here is a link to the zip hosted on GitHub: https://github.com/TCROC/aws-iot-custom-auth/releases/download/precompiled-arm64-lambda/aws-iot-auth-issues.zip
Cross compiling for Linux ARM64 CPUs from Windows is somewhat tricky, so the .zip should be easier than building from source, unless you are comfortable using WSL on Windows.
And here is a copy of the README explaining the issue, the research I've done, and how to reproduce:
aws-iot-custom-auth
Dependencies
Tested on Ubuntu 22.04 and Windows 10.
Windows 10 requires WSL Ubuntu 22.04 for cross compiling to ARM64 processors.
- Install git: https://git-scm.com/downloads
  - NOTE: Reproduced with version: git version 2.40.1
- Install the rust toolset: https://www.rust-lang.org/tools/install
  - NOTE: Reproduced with version: rustup 1.26.0 (5af9b9484 2023-04-05), cargo 1.69.0 (6e9a83356 2023-04-12), rustc 1.69.0 (84c898d65 2023-04-16)
- Install cargo lambda: https://github.com/awslabs/aws-lambda-rust-runtime
  - NOTE: Reproduced with version: cargo-lambda 0.19.0 (e7a2b99 2023-04-07Z)
- Install aws cli v2: https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
  - NOTE: Reproduced with version: aws-cli/2.11.15 Python/3.11.3 Linux/6.2.6-76060206-generic exe/x86_64.pop.22 prompt/off
- Install dotnet 7.0: https://dotnet.microsoft.com/en-us/download/dotnet/7.0
  - NOTE: Reproduced with version: 7.0.203
- Clone:
  git clone https://github.com/TCROC/aws-iot-custom-auth.git --recurse-submodules
NOTE: When running the scripts, you can ignore the AWS CLI errors that are logged. The scripts do things such as check whether the lambda function is deployed by calling aws lambda get-function. If the command errors, the script assumes the function doesn't exist in the cloud and attempts to create it.
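The existence check described in the note can be sketched as follows. This is a minimal Python illustration, not the repository's actual bash scripts; the helper names `function_exists` and `run_cli` are hypothetical. A non-zero exit code can also mean missing credentials or a wrong region, so treating every error as "not deployed" is a simplification.

```python
import subprocess
from typing import Callable, Sequence


def run_cli(args: Sequence[str]) -> int:
    """Run a command quietly and return its exit code."""
    completed = subprocess.run(
        list(args), stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return completed.returncode


def function_exists(name: str, run: Callable[[Sequence[str]], int] = run_cli) -> bool:
    """Treat a zero exit code from `aws lambda get-function` as "deployed"."""
    return run(["aws", "lambda", "get-function", "--function-name", name]) == 0
```

A script would run its create step only when `function_exists(...)` returns False. Passing a fake `run` callable lets the logic be exercised without AWS access.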
Create Lambda Authorizer
Run in a bash shell:
./create-lambda.sh
Create certificate
Run in a bash shell:
./create-cert.sh
Test Lambda Authorizer
Run in a bash shell:
./run-client-lambda.sh
Expected result: The MQTT client stays connected and sends keep alive packets for 24 hours, as specified in the policy returned from the lambda function.
Actual result: The MQTT client is disconnected after anywhere between 30 seconds and 5 minutes.
Documentation: https://docs.aws.amazon.com/iot/latest/developerguide/config-custom-auth.html
You can test the response of the authorizer in the console: https://docs.aws.amazon.com/lambda/latest/dg/testing-functions.html
Example test event:
NOTE: The password is testpassword, base64 encoded.
{
  "token": "aToken",
  "signatureVerified": false,
  "protocols": ["tls", "http", "mqtt"],
  "protocolData": {
    "tls": { "serverName": "serverName" },
    "http": {
      "headers": { "#{name}": "#{value}" },
      "queryString": "?#{name}=#{value}"
    },
    "mqtt": {
      "username": "test",
      "password": "dGVzdHBhc3N3b3Jk",
      "clientId": "testid"
    }
  },
  "connectionMetadata": { "id": "UUID" }
}
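The `password` field in the test event is base64 encoded, so an authorizer has to decode it before comparing credentials. A minimal Python sketch of that step; the function name `decode_mqtt_password` is hypothetical and is not taken from the linked repository.

```python
import base64


def decode_mqtt_password(event: dict) -> str:
    """Return the MQTT password from a custom-authorizer event as plain text."""
    encoded = event["protocolData"]["mqtt"]["password"]
    return base64.b64decode(encoded).decode("utf-8")


# Trimmed copy of the test event above; only the fields used here are kept.
event = {
    "protocolData": {
        "mqtt": {
            "username": "test",
            "password": "dGVzdHBhc3N3b3Jk",
            "clientId": "testid",
        }
    }
}
print(decode_mqtt_password(event))  # testpassword
```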
Example result:
{
  "isAuthenticated": true,
  "principalId": "testid",
  "disconnectAfterInSeconds": 86400,
  "refreshAfterInSeconds": 86400,
  "policyDocuments": [
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["iot:Connect"],
          "Resource": ["arn:aws:iot:us-east-1:144868213084:client/${iot:ClientId}"],
          "Condition": {
            "ArnEquals": {
              "iot:LastWillTopic": ["arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"]
            }
          }
        },
        {
          "Effect": "Allow",
          "Action": ["iot:Receive"],
          "Resource": ["arn:aws:iot:us-east-1:144868213084:topic/open/*"],
          "Condition": {}
        },
        {
          "Effect": "Allow",
          "Action": ["iot:Publish"],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topic/open/d/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/p/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
          ],
          "Condition": {}
        },
        {
          "Effect": "Allow",
          "Action": ["iot:Subscribe"],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/d/${iot:ClientId}/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/p/*/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/s/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/f/*"
          ],
          "Condition": {}
        }
      ]
    }
  ]
}
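To rule out the response itself being malformed, a quick local sanity check can help. The sketch below is a hypothetical helper, not part of the repository. The limits it uses (300 to 86400 seconds, at most 10 policy documents, at most 2048 characters each) are my understanding of the custom authentication documentation linked above and should be verified against the current docs.

```python
import json

# Assumed limits; verify against the current AWS IoT custom authentication docs.
MIN_SECONDS, MAX_SECONDS = 300, 86400
MAX_POLICY_DOCUMENTS = 10
MAX_POLICY_CHARS = 2048


def check_authorizer_response(response: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in ("disconnectAfterInSeconds", "refreshAfterInSeconds"):
        value = response.get(key)
        if not isinstance(value, int) or not MIN_SECONDS <= value <= MAX_SECONDS:
            problems.append(f"{key} out of range: {value!r}")
    documents = response.get("policyDocuments", [])
    if len(documents) > MAX_POLICY_DOCUMENTS:
        problems.append(f"too many policy documents: {len(documents)}")
    for index, document in enumerate(documents):
        # Accept either a JSON string or an already-parsed object.
        if isinstance(document, str):
            text = document
        else:
            text = json.dumps(document, separators=(",", ":"))
        if len(text) > MAX_POLICY_CHARS:
            problems.append(f"policy document {index} is {len(text)} characters")
    return problems


sample = {
    "isAuthenticated": True,
    "principalId": "testid",
    "disconnectAfterInSeconds": 86400,
    "refreshAfterInSeconds": 86400,
    "policyDocuments": [{"Version": "2012-10-17", "Statement": []}],
}
print(check_authorizer_response(sample))  # []
```

Under these assumed limits, the 86400 second values in the example result are within range, so the early disconnects are not explained by out-of-range timing fields.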
Test Lambda Certificates
Run in a bash shell:
./run-client-cert.sh
Expected result: The MQTT client authenticates and connects to IoT.
Actual result: The client is immediately disconnected due to an authorization error.
Documentation: https://docs.aws.amazon.com/iot/latest/developerguide/x509-client-certs.html
Cleanup
The lambda functions, authorizers, and certificates in AWS will be deleted.
Run in a bash shell:
./clean-aws.sh
Edit: The git clone command was incorrectly pointing to one of the submodules. I changed it to point to the correct repo.
Edit: After debugging with Microsoft on the MQTTnet library, we made some useful findings:
https://github.com/dotnet/MQTTnet/issues/1729
We found that there is a policy causing this issue: https://github.com/dotnet/MQTTnet/issues/1729#issuecomment-1537569554
✅ Policy that works
{
  "isAuthenticated": true,
  "principalId": "testid",
  "disconnectAfterInSeconds": 86400,
  "refreshAfterInSeconds": 86400,
  "policyDocuments": [
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["iot:Connect"],
          "Resource": ["arn:aws:iot:us-east-1:144868213084:client/${iot:ClientId}"]
        },
        {
          "Effect": "Allow",
          "Action": ["iot:Receive"],
          "Resource": ["arn:aws:iot:us-east-1:144868213084:topic/open/*"]
        },
        {
          "Effect": "Allow",
          "Action": ["iot:Publish"],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topic/open/d/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/p/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
          ]
        },
        {
          "Effect": "Allow",
          "Action": ["iot:Subscribe"],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/d/${iot:ClientId}/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/p/*/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/s/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/f/*"
          ]
        }
      ]
    }
  ]
}
❌ Policy that disconnects randomly between 30 seconds and 5 minutes
{
  "isAuthenticated": true,
  "principalId": "testid",
  "disconnectAfterInSeconds": 86400,
  "refreshAfterInSeconds": 86400,
  "policyDocuments": [
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["iot:Connect"],
          "Resource": ["arn:aws:iot:us-east-1:144868213084:client/${iot:ClientId}"],
          "Condition": {
            "ArnEquals": {
              "iot:LastWillTopic": ["arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"]
            }
          }
        },
        {
          "Effect": "Allow",
          "Action": ["iot:Receive"],
          "Resource": ["arn:aws:iot:us-east-1:144868213084:topic/open/*"]
        },
        {
          "Effect": "Allow",
          "Action": ["iot:Publish"],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topic/open/d/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/p/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
          ]
        },
        {
          "Effect": "Allow",
          "Action": ["iot:Subscribe"],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/d/${iot:ClientId}/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/p/*/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/s/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/f/*"
          ]
        }
      ]
    }
  ]
}
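The two responses can also be compared programmatically rather than by eye. The following is a minimal Python sketch with a hypothetical helper, `diff_statements`, run against shortened copies of the two policy documents above (only the Connect and Receive statements are kept for brevity). It pairs statements by position, so it assumes both policies list their statements in the same order.

```python
def diff_statements(first: dict, second: dict) -> list:
    """Return (statement index, key) pairs whose values differ between two policies."""
    differences = []
    for index, (a, b) in enumerate(zip(first["Statement"], second["Statement"])):
        for key in sorted(set(a) | set(b)):
            if a.get(key) != b.get(key):
                differences.append((index, key))
    return differences


connect = {
    "Effect": "Allow",
    "Action": ["iot:Connect"],
    "Resource": ["arn:aws:iot:us-east-1:144868213084:client/${iot:ClientId}"],
}
receive = {
    "Effect": "Allow",
    "Action": ["iot:Receive"],
    "Resource": ["arn:aws:iot:us-east-1:144868213084:topic/open/*"],
}
condition = {
    "ArnEquals": {
        "iot:LastWillTopic": [
            "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
        ]
    }
}

working = {"Version": "2012-10-17", "Statement": [connect, receive]}
failing = {
    "Version": "2012-10-17",
    "Statement": [dict(connect, Condition=condition), receive],
}
print(diff_statements(working, failing))  # [(0, 'Condition')]
```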
Comparing the two policy documents in VS Code shows that the only difference, and therefore the offending part, is the condition requiring a LastWillTopic:
"Condition": {
  "ArnEquals": {
    "iot:LastWillTopic": ["arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"]
  }
}
My client does specify a LastWill topic in the builder:
var options = optionsBuilder
    .WithClientId(username)
    .WithWillTopic($"{rootTopic}/s/{username}")
    .WithWillRetain(false)
    .WithWillPayload(new byte[] { 0 })
    .WithWillQualityOfServiceLevel(MqttQualityOfServiceLevel.AtLeastOnce)
    .WithProtocolVersion(MqttProtocolVersion.V500)
    .WithKeepAlivePeriod(TimeSpan.FromSeconds(35))
    .WithoutPacketFragmentation()
    .Build();
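As a local cross-check that the will topic built by the client lines up with the ARN in the policy condition, the comparison can be sketched in Python. This is only a string comparison under stated assumptions: `rootTopic` is assumed to be `open`, the username `testid` is a placeholder (the builder uses the username as the client ID), and the helper names are hypothetical. AWS IoT evaluates the real policy server-side, so a match here does not prove the service will accept the connection.

```python
CLIENT_ID_VARIABLE = "${iot:ClientId}"


def resolve_policy_arn(template: str, client_id: str) -> str:
    """Substitute the client ID policy variable into a policy ARN template."""
    return template.replace(CLIENT_ID_VARIABLE, client_id)


def will_topic_arn(
    root_topic: str,
    username: str,
    region: str = "us-east-1",
    account_id: str = "144868213084",
) -> str:
    """Build the ARN for the will topic the client sets: {rootTopic}/s/{username}."""
    return f"arn:aws:iot:{region}:{account_id}:topic/{root_topic}/s/{username}"


policy_template = "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
username = "testid"  # placeholder; also used as the client ID by the builder
root_topic = "open"  # assumed value of rootTopic

expected = resolve_policy_arn(policy_template, username)
actual = will_topic_arn(root_topic, username)
print(expected == actual)  # True
```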
And according to the AWS docs here, I should be able to restrict users to connecting only with a specific last will topic, right? The very last example in the docs shows such a policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["iot:Connect"],
      "Resource": "arn:aws:iot:us-east-1:123456789012:client/client1",
      "Condition": {
        "ArnEquals": {
          "iot:LastWillTopic": "arn:aws:iot:region:account-id:topic/my/lastwill/topicName"
        }
      }
    },
    {
      "Effect": "Deny",
      "Action": ["iot:Connect"],
      "Resource": "*",
      "Condition": {
        "ForAnyValue:StringEquals": {
          "iot:ConnectAttributes": ["PersistentConnect"]
        }
      }
    }
  ]
}
Edit:
After looking at the docs, I noticed that the condition value there is a single string instead of an array like mine. I updated my condition to be:
"Condition": {
  "ArnEquals": {
    "iot:LastWillTopic": "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
  }
}
But it still disconnects with an AUTHORIZATION_ERROR after anywhere between 30 seconds and 5 minutes.
After debugging with Microsoft over here: https://github.com/dotnet/MQTTnet/issues/1729
We found that there is a policy causing this issue: https://github.com/dotnet/MQTTnet/issues/1729#issuecomment-1537569554
I would paste the entire comment in here, but re:Post won't let me post a comment that long. I'll try editing my original question.
Can you please post the solution you found as an answer here?
It isn't a solution. It is still an issue. Per this documentation, the above should be a valid policy: https://docs.aws.amazon.com/iot/latest/developerguide/connect-policy.html#persistent-sessions-examples