Experiencing “The security token included in the request is expired” error with AWS Lambda, AssumeRole and SES

Hello AWS Community,

I'm currently using AWS Lambda for a batch process that sends emails. As part of this process, I call AssumeRole to get temporary credentials that are valid for 15 minutes, and I store the resulting session in ElastiCache for about 13 minutes.

Here's the relevant part of my code:

const assumeRole = {
  RoleArn: roleArn,
  RoleSessionName: `${randomName}`,
  DurationSeconds: 900, // 15 minutes
};
const { Credentials } = await stsClient.send(
  new AssumeRoleCommand(assumeRole),
);
await cache.hset(__AWSCREDENTIAL__, /** Some of Credentials store here **/)
await cache.expire(__AWSCREDENTIAL__, 780, 'NX') // 13 minutes

const ses = new aws.SES({/** Credentials & Regions **/})
// createTransport comes from nodemailer. The returned transport is reused while the cache entry still exists.
return createTransport({ SES: { ses, aws } });

The idea is to use the cached transport if it's available; otherwise, a new transport is created. However, during testing, I occasionally encounter the following error.

The security token included in the request is expired

I currently run two separate batches. Each has its own Lambda function and execution role, but both share the same source code and assume the same Role ARN, so one batch may end up using the other's sessionToken. The error appears randomly, though: the first batch never fails, while the second fails intermittently (sometimes it keeps failing for about 10 minutes).

I'm looking for any insights or advice on how to address this issue. Any help would be greatly appreciated!

Thank you in advance.

asked 2 years ago, 1.7K views
2 Answers
Accepted Answer

I just found the problem. It was in how I managed the "transport" variable. When at least two processes are running and both are kept alive, and one of them is triggered first and changes the sessionToken, the second one has no way of knowing whether the new sessionToken belongs to it. So it keeps using the old transport, which leads to the error.
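One possible way to avoid reusing a stale transport is to remember which sessionToken each transport was built from and rebuild it whenever the token in the shared cache differs. The following is only a minimal sketch, not necessarily the fix that was actually applied: `createTransportHolder` and `buildTransport` are hypothetical names, and `buildTransport` stands in for the nodemailer/SES setup from the question.

```javascript
// Hypothetical helper (not part of nodemailer or the AWS SDK).
// `buildTransport(credentials)` is assumed to wrap the SES + createTransport
// setup shown in the question.
function createTransportHolder(buildTransport) {
  let current = null; // { transport, sessionToken } held by this process

  return function getTransport(credentials) {
    // Rebuild when nothing is held yet, or when the shared cache now
    // contains a different session than the one this transport was built from.
    const stale =
      current === null || current.sessionToken !== credentials.SessionToken;

    if (stale) {
      current = {
        transport: buildTransport(credentials),
        sessionToken: credentials.SessionToken,
      };
    }
    return current.transport;
  };
}

// Usage with a fake builder so the behaviour is visible without AWS access.
let builds = 0;
const getTransport = createTransportHolder((creds) => ({
  id: ++builds,
  token: creds.SessionToken,
}));

const a = getTransport({ SessionToken: "token-A" });
const b = getTransport({ SessionToken: "token-A" }); // same token: reused
const c = getTransport({ SessionToken: "token-B" }); // token changed: rebuilt
console.log(a === b, b === c, builds); // true false 2
```

The caller would still be responsible for reading the current credentials from the cache on every invocation and passing them in. The holder only decides whether the transport it already has matches them.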

answered 2 years ago
EXPERT
reviewed 2 years ago

Well, it sounds like you're fetching a session at time index 0 seconds that's valid until time index 900 seconds. Then you're putting it in the cache and allowing it to be fetched until time index 780 seconds. If you fetched the session from the cache at time index 700, for example, it would only have 200 seconds left and not the >10 minutes your code needs to run.
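The arithmetic above can be checked before reusing cached credentials. This is a rough sketch with hypothetical helper names (`remainingSeconds`, `isUsable`). It assumes the `Expiration` value returned by AssumeRole is stored in the cache alongside the credentials and is available as a `Date`.

```javascript
// Hypothetical helpers; the names are illustrative, not part of any AWS SDK.

// Whole seconds of validity left on a credential set at time `now`.
function remainingSeconds(expiration, now = new Date()) {
  return Math.floor((expiration.getTime() - now.getTime()) / 1000);
}

// True only if the credentials will outlive the work about to be done.
function isUsable(expiration, requiredSeconds, now = new Date()) {
  return remainingSeconds(expiration, now) >= requiredSeconds;
}

// Worked example with the numbers from the answer above.
const issuedAt = new Date("2024-01-01T00:00:00Z"); // arbitrary time index 0
const expiration = new Date(issuedAt.getTime() + 900 * 1000); // DurationSeconds: 900
const fetchedAt = new Date(issuedAt.getTime() + 700 * 1000); // cache hit at 700 s

console.log(remainingSeconds(expiration, fetchedAt)); // 200
console.log(isUsable(expiration, 600, fetchedAt)); // false: run needs 10 minutes
```

With a check like this, a cache hit that leaves too little time would trigger a fresh AssumeRole call instead of reusing the nearly expired session.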

Why are you doing the AssumeRole at all? Why not simply use the Lambda function's execution role to send the emails? Lambda will take care of obtaining those credentials without any calls being made to AssumeRole.

EXPERT
answered 2 years ago
