IoT Custom Authorizer not respecting refreshAfterInSeconds or disconnectAfterInSeconds in returned policy

I created a GitHub project with scripts that use the AWS CLI to automate reproducing the issue I'm facing.

Here is the link to the project: https://github.com/TCROC/aws-iot-custom-auth

Here is a link to the zip hosted on GitHub: https://github.com/TCROC/aws-iot-custom-auth/releases/download/precompiled-arm64-lambda/aws-iot-auth-issues.zip.

Cross compiling for Linux ARM64 CPUs from Windows is somewhat tricky, so unless you are comfortable using WSL on Windows, the .zip should be easier than building from source.

And here is a copy of the README explaining the issue, the research I've done, and how to reproduce:

aws-iot-custom-auth

Dependencies

Tested on Ubuntu 22.04 and Windows 10.

Windows 10 requires WSL Ubuntu 22.04 for cross compiling to ARM64 processors.

  1. Install git: https://git-scm.com/downloads
    • NOTE: Reproduced with version: git version 2.40.1
  2. Install the rust toolset: https://www.rust-lang.org/tools/install
    • NOTE: Reproduced with version: rustup 1.26.0 (5af9b9484 2023-04-05), cargo 1.69.0 (6e9a83356 2023-04-12), rustc 1.69.0 (84c898d65 2023-04-16)
  3. Install cargo lambda: https://github.com/awslabs/aws-lambda-rust-runtime
    • NOTE: Reproduced with version: cargo-lambda 0.19.0 (e7a2b99 2023-04-07Z)
  4. Install aws cli v2: https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
    • NOTE: Reproduced with version: aws-cli/2.11.15 Python/3.11.3 Linux/6.2.6-76060206-generic exe/x86_64.pop.22 prompt/off
  5. Install dotnet 7.0: https://dotnet.microsoft.com/en-us/download/dotnet/7.0
    • NOTE: Reproduced with version: 7.0.203
  6. Clone:
    git clone https://github.com/TCROC/aws-iot-custom-auth.git --recurse-submodules
    

NOTE: When running the scripts, you can ignore the aws cli errors that are logged. The scripts do things such as checking whether the lambda function is deployed by calling aws lambda get-function. If the command errors, the script assumes the function doesn't exist in the cloud and attempts to create it.

Create Lambda Authorizer

Run in a bash shell:

./create-lambda.sh

Create certificate

Run in a bash shell:

./create-cert.sh

Test Lambda Authorizer

Run in a bash shell:

./run-client-lambda.sh

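Expected result: The MQTT client stays connected and keeps sending keep-alive packets for 24 hours (86400 seconds), as specified by disconnectAfterInSeconds and refreshAfterInSeconds in the response returned from the lambda function.

Actual result: The MQTT client is disconnected anywhere between 30 seconds and 5 minutes after connecting.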

Documentation: https://docs.aws.amazon.com/iot/latest/developerguide/config-custom-auth.html

You can test the response of the authorizer in the console: https://docs.aws.amazon.com/lambda/latest/dg/testing-functions.html

Example test event:

NOTE: The password is testpassword base64 encoded

{
    "token": "aToken",
    "signatureVerified": false,
    "protocols": [
        "tls",
        "http",
        "mqtt"
    ],
    "protocolData": {
        "tls": {
            "serverName": "serverName"
        },
        "http": {
            "headers": {
                "#{name}": "#{value}"
            },
            "queryString": "?#{name}=#{value}"
        },
        "mqtt": {
            "username": "test",
            "password": "dGVzdHBhc3N3b3Jk",
            "clientId": "testid"
        }
    },
    "connectionMetadata": {
        "id": "UUID"
    }
}
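As a quick way to confirm the NOTE above, the password field can be reproduced in a few lines. This is only a sketch in Python (the project itself does not use Python), and the helper names encode_mqtt_password and decode_mqtt_password are made up for illustration:

```python
import base64


def encode_mqtt_password(plain: str) -> str:
    """Return the base64 text used for the mqtt.password field of the test event."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


def decode_mqtt_password(encoded: str) -> str:
    """Reverse of encode_mqtt_password, handy when reading a test event back."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    # Prints the value that appears in the example test event.
    print(encode_mqtt_password("testpassword"))
```

The printed value should match the "password" entry in the example test event.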

Example result:

{
  "isAuthenticated": true,
  "principalId": "testid",
  "disconnectAfterInSeconds": 86400,
  "refreshAfterInSeconds": 86400,
  "policyDocuments": [
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "iot:Connect"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:client/${iot:ClientId}"
          ],
          "Condition": {
            "ArnEquals": {
              "iot:LastWillTopic": [
                "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
              ]
            }
          }
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Receive"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topic/open/*"
          ],
          "Condition": {}
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Publish"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topic/open/d/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/p/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
          ],
          "Condition": {}
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Subscribe"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/d/${iot:ClientId}/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/p/*/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/s/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/f/*"
          ],
          "Condition": {}
        }
      ]
    }
  ]
}
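Before blaming the service, it may be worth ruling out a malformed response. The sketch below checks the fields shown in the example result. The 300 to 86400 second bounds are my reading of the custom authorizer documentation linked above and should be treated as an assumption, and check_authorizer_response is a hypothetical helper, not part of any AWS SDK:

```python
# Assumed bounds for both timing fields, taken from my reading of the docs.
MIN_SECONDS = 300
MAX_SECONDS = 86400


def check_authorizer_response(response: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if response.get("isAuthenticated") is not True:
        problems.append("isAuthenticated is not true")
    if not response.get("principalId"):
        problems.append("principalId is missing or empty")
    for key in ("disconnectAfterInSeconds", "refreshAfterInSeconds"):
        value = response.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            problems.append(f"{key} is missing or not an integer")
        elif not MIN_SECONDS <= value <= MAX_SECONDS:
            problems.append(f"{key}={value} is outside {MIN_SECONDS}..{MAX_SECONDS}")
    if not response.get("policyDocuments"):
        problems.append("policyDocuments is missing or empty")
    return problems


# Trimmed copy of the example result; only the first statement is kept.
example = {
    "isAuthenticated": True,
    "principalId": "testid",
    "disconnectAfterInSeconds": 86400,
    "refreshAfterInSeconds": 86400,
    "policyDocuments": [
        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Action": ["iot:Connect"],
                    "Resource": [
                        "arn:aws:iot:us-east-1:144868213084:client/${iot:ClientId}"
                    ],
                }
            ],
        }
    ],
}

if __name__ == "__main__":
    print(check_authorizer_response(example))
```

For the example result above, the check reports no problems, which suggests the disconnects are not caused by out-of-range timing values.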

Test Lambda Certificates

Run in a bash shell:

./run-client-cert.sh

Expected result: The MQTT client authenticates and connects to IoT.

Actual result: The client is immediately disconnected due to an authorization error.

Documentation: https://docs.aws.amazon.com/iot/latest/developerguide/x509-client-certs.html

Cleanup

The lambda functions, authorizers, and certificates in aws will be deleted.

Run in a bash shell:

./clean-aws.sh

Edit: The git clone command was incorrectly pointing to one of the submodules. I changed it to point to the correct repo.


Edit: After debugging with Microsoft on the MQTTnet library, we made some useful findings:

https://github.com/dotnet/MQTTnet/issues/1729

We found that there is a policy causing this issue: https://github.com/dotnet/MQTTnet/issues/1729#issuecomment-1537569554

✅ Policy that works

{
  "isAuthenticated": true,
  "principalId": "testid",
  "disconnectAfterInSeconds": 86400,
  "refreshAfterInSeconds": 86400,
  "policyDocuments": [
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "iot:Connect"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:client/${iot:ClientId}"
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Receive"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topic/open/*"
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Publish"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topic/open/d/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/p/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Subscribe"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/d/${iot:ClientId}/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/p/*/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/s/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/f/*"
          ]
        }
      ]
    }
  ]
}

❌ Policy that disconnects randomly between 30 seconds and 5 minutes

{
  "isAuthenticated": true,
  "principalId": "testid",
  "disconnectAfterInSeconds": 86400,
  "refreshAfterInSeconds": 86400,
  "policyDocuments": [
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "iot:Connect"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:client/${iot:ClientId}"
          ],
          "Condition": {
            "ArnEquals": {
              "iot:LastWillTopic": [
                "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
              ]
            }
          }
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Receive"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topic/open/*"
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Publish"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topic/open/d/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/p/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Subscribe"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/d/${iot:ClientId}/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/p/*/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/s/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/f/*"
          ]
        }
      ]
    }
  ]
}

Comparing the two policy documents in VS Code shows that the only difference, and therefore the offending part, is the condition requiring a LastWillTopic on the iot:Connect statement:

"Condition": {
  "ArnEquals": {
    "iot:LastWillTopic": [
      "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
    ]
  }
}
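The same comparison can be scripted instead of done by eye. The following is a rough sketch that assumes both responses contain a single policy document with the statements in the same order; diff_statements is a hypothetical helper and the two responses are trimmed copies of the ones above:

```python
import json


def diff_statements(working: dict, failing: dict) -> list:
    """Return (index, key, working_value, failing_value) for each differing key.

    Assumes one policy document per response and statements in matching order.
    """
    diffs = []
    working_statements = working["policyDocuments"][0]["Statement"]
    failing_statements = failing["policyDocuments"][0]["Statement"]
    for index, (ws, fs) in enumerate(zip(working_statements, failing_statements)):
        for key in sorted(set(ws) | set(fs)):
            if ws.get(key) != fs.get(key):
                diffs.append((index, key, ws.get(key), fs.get(key)))
    return diffs


connect = {
    "Effect": "Allow",
    "Action": ["iot:Connect"],
    "Resource": ["arn:aws:iot:us-east-1:144868213084:client/${iot:ClientId}"],
}
receive = {
    "Effect": "Allow",
    "Action": ["iot:Receive"],
    "Resource": ["arn:aws:iot:us-east-1:144868213084:topic/open/*"],
}
# Same iot:Connect statement plus the LastWillTopic condition.
connect_with_will = dict(
    connect,
    Condition={
        "ArnEquals": {
            "iot:LastWillTopic": [
                "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
            ]
        }
    },
)

working = {"policyDocuments": [{"Version": "2012-10-17", "Statement": [connect, receive]}]}
failing = {"policyDocuments": [{"Version": "2012-10-17", "Statement": [connect_with_will, receive]}]}

if __name__ == "__main__":
    for index, key, before, after in diff_statements(working, failing):
        print(f"Statement[{index}].{key}: {json.dumps(before)} -> {json.dumps(after)}")
```

Run against the trimmed responses, the only difference reported is the Condition key on the first statement.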

My client does specify a LastWill topic in the builder:

var options = optionsBuilder
    .WithClientId(username)
    .WithWillTopic($"{rootTopic}/s/{username}")
    .WithWillRetain(false)
    .WithWillPayload(new byte[] { 0 })
    .WithWillQualityOfServiceLevel(MqttQualityOfServiceLevel.AtLeastOnce)
    .WithProtocolVersion(MqttProtocolVersion.V500)
    .WithKeepAlivePeriod(TimeSpan.FromSeconds(35))
    .WithoutPacketFragmentation()
    .Build();
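To rule out a simple mismatch between the will topic the client sends and the ARN in the condition, the two can be compared as plain strings. This sketch assumes rootTopic is "open" and username is "testid" (neither value is shown in the snippet above), and it only substitutes ${iot:ClientId} textually; it does not reproduce how AWS IoT actually evaluates policies:

```python
REGION = "us-east-1"
ACCOUNT_ID = "144868213084"

# ARN template copied from the LastWillTopic condition in the failing policy.
CONDITION_ARN = "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"


def will_topic(root_topic: str, username: str) -> str:
    """Mirror of the client's WithWillTopic($"{rootTopic}/s/{username}") call."""
    return f"{root_topic}/s/{username}"


def topic_arn(topic: str) -> str:
    """Build the topic ARN for a concrete topic name."""
    return f"arn:aws:iot:{REGION}:{ACCOUNT_ID}:topic/{topic}"


def resolve_policy_arn(template: str, client_id: str) -> str:
    """Textually substitute the ${iot:ClientId} policy variable."""
    return template.replace("${iot:ClientId}", client_id)


def will_matches_condition(root_topic: str, username: str, client_id: str) -> bool:
    """True when the client's will topic ARN equals the resolved condition ARN."""
    return topic_arn(will_topic(root_topic, username)) == resolve_policy_arn(
        CONDITION_ARN, client_id
    )


if __name__ == "__main__":
    # Assumed values: rootTopic "open", username and client id "testid".
    print(will_matches_condition("open", "testid", "testid"))
```

Under those assumed values the strings match, so a typo in the will topic would not explain the disconnects. The check fails as soon as the username used in the will topic differs from the client id.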

According to the AWS docs linked below, I should be able to restrict users to connecting with a specific last will topic, right? The very last example on that page shows such a policy:

https://docs.aws.amazon.com/iot/latest/developerguide/connect-policy.html#persistent-sessions-examples

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "iot:Connect"
            ],
            "Resource": "arn:aws:iot:us-east-1:123456789012:client/client1",
            "Condition": {
                "ArnEquals": {
                    "iot:LastWillTopic": "arn:aws:iot:region:account-id:topic/my/lastwill/topicName"
                }
            }
        },
        {
            "Effect": "Deny",
            "Action": [
                "iot:Connect"
            ],
            "Resource": "*",
            "Condition": {
                "ForAnyValue:StringEquals": {
                    "iot:ConnectAttributes": [
                        "PersistentConnect"
                    ]
                }
            }
        }
    ]
}

Edit:

After looking at the docs, I noticed that the condition value in their example is a single string rather than an array like mine. I updated my condition to be:

          "Condition": {
            "ArnEquals": {
              "iot:LastWillTopic": "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
            }
          }

But it still disconnects with an AUTHORIZATION_ERROR anywhere between 30 seconds and 5 minutes after connecting.
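That the change made no difference would be expected if AWS IoT policies follow the usual IAM convention that a condition value may be written either as a single string or as a list of strings. I am assuming that convention applies here. Under that assumption, the two forms normalize to the same thing, as this small sketch with a hypothetical normalize_condition helper illustrates:

```python
def normalize_condition(condition: dict) -> dict:
    """Rewrite every condition value as a sorted list of strings.

    Assumes a lone string and a one-element list mean the same thing,
    as in the usual IAM policy grammar.
    """
    normalized = {}
    for operator, keys in condition.items():
        normalized[operator] = {
            key: sorted([value] if isinstance(value, str) else list(value))
            for key, value in keys.items()
        }
    return normalized


ARN = "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"

# Array form from the original policy and string form from the edit above.
array_form = {"ArnEquals": {"iot:LastWillTopic": [ARN]}}
string_form = {"ArnEquals": {"iot:LastWillTopic": ARN}}

if __name__ == "__main__":
    print(normalize_condition(array_form) == normalize_condition(string_form))
```

If that assumption holds, the array versus string difference is cosmetic and the cause of the AUTHORIZATION_ERROR lies elsewhere.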
