How can I use Step Functions to stop an Amazon RDS instance for longer than 7 days?

I want to use AWS Step Functions to stop an Amazon Relational Database Service (Amazon RDS) DB instance for longer than the seven-day limit.

Short description

By default, you can stop an Amazon RDS database instance for up to seven days at a time. After seven days, the instance restarts so that it doesn't miss any maintenance updates.

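To keep your instance stopped for more than seven days, use Step Functions to automate a workflow that starts the instance shortly before each maintenance window and stops it again afterward. This way, the instance doesn't miss a maintenance window.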

Note: For an alternative resolution, see How can I use an AWS Lambda function to stop an Amazon RDS instance for longer than seven days?

Resolution

Configure IAM permissions

Create an AWS Identity and Access Management (IAM) policy that allows Step Functions to start and stop an instance and retrieve information on the instance:

1.    Open the IAM console.

2.    In the navigation pane, choose Policies. Then, choose Create Policy.

3.    Choose the JSON tab. Then, enter the following policy to grant the required IAM permissions:

{
    "Version": "2012-10-17",
    "Statement":
    [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "rds:DescribeDBInstances",
                "rds:StopDBInstance",
                "rds:StartDBInstance"
            ],
            "Resource": "*"
        }
    ]
}

4.    Choose Next: Tags.

5.    (Optional) To add a tag, choose Add tag, and then enter the appropriate values for the Key and Value fields.

6.    Choose Next: Review.

7.    Enter the name for your policy. For example, enter step-functions-start-stop-rds-policy. To review your policy's granted permissions, see the Summary section.

8.    Choose Create policy.

For more information, see Creating policies using the JSON editor.
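
The console steps above can also be scripted. The following is a minimal sketch, assuming that the AWS SDK for Python (boto3) is installed and that credentials are already configured. The function names build_policy_document and create_policy are illustrative and aren't part of the original procedure.

```python
import json


def build_policy_document() -> dict:
    """Return the same policy document that is entered on the JSON tab."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": [
                    "rds:DescribeDBInstances",
                    "rds:StopDBInstance",
                    "rds:StartDBInstance",
                ],
                "Resource": "*",
            }
        ],
    }


def create_policy(name: str = "step-functions-start-stop-rds-policy") -> str:
    """Create the managed policy and return its ARN. This calls AWS."""
    import boto3  # imported lazily so the builder above works without boto3

    iam = boto3.client("iam")
    response = iam.create_policy(
        PolicyName=name,
        PolicyDocument=json.dumps(build_policy_document()),
    )
    return response["Policy"]["Arn"]
```

Keeping the document builder separate from the API call makes it possible to review or unit test the policy without touching an AWS account.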

Create an IAM role and attach the required policies

1.    Open the IAM console.

2.    In the navigation pane, choose Roles. Then, choose Create role.

3.    For Select type of trusted entity, choose AWS service.

4.    In the Use cases for other AWS service dropdown list, choose Step Functions. Then, choose the Step Functions option.

5.    Choose Next.
Note: Don't take any actions on the Add permissions page. Create the role first, and then edit the default permissions.

6.    Choose Next.

7.    For Role name, enter the name for the role. For example, enter step-functions-start-stop-rds-role.
(Optional) Update the role description.
(Optional) To add a tag, enter the appropriate values for the Key and Value fields.

8.    Choose Create role. This returns you to the Roles list.

9.    In the search box, enter the name of the role that you created. Then, select that role to see its details.

10.    In the Permissions tab, choose the Add Permissions dropdown list. Then, choose Attach policies.

11.    Enter the name of the policy that you created in the Configure IAM permissions section. For example, enter step-functions-start-stop-rds-policy. When you see this policy as an option, select it.

12.    In the Permissions tab, select the AWSLambdaRole AWS managed policy, and then choose Remove.

For more information, see Creating a role for an AWS service (console).
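
When you choose Step Functions as the trusted entity, the console writes a trust policy for the role. The following sketch shows what an equivalent scripted setup might look like, assuming boto3 and configured credentials. The function names are illustrative, and policy_arn is assumed to be the ARN of the policy from the Configure IAM permissions section.

```python
import json


def build_trust_policy() -> dict:
    """Trust policy that lets the Step Functions service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "states.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_role(
    policy_arn: str,
    role_name: str = "step-functions-start-stop-rds-role",
) -> None:
    """Create the role and attach the start/stop policy. This calls AWS."""
    import boto3  # imported lazily so the builder above works without boto3

    iam = boto3.client("iam")
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(build_trust_policy()),
    )
    iam.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
```

A role that is created through the API starts with no managed policies attached, so this sketch has no equivalent of the step that removes AWSLambdaRole.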

Add tags for DB instances

1.    Open the Amazon RDS console.

2.    In the navigation pane, choose Databases.

3.    Select the DB instance that you want to start and stop automatically.

4.    Choose the Tags tab.

5.    Choose Add. For Tag key, enter autostart. For Value, enter yes.

6.    Choose Add another Tag. For Tag key, enter autostop. For Value, enter yes.

7.    To save these tags, choose Add.

For more information, see Adding, listing, and removing tags.
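
If you have many DB instances to tag, the same two tags can be applied from a script. This is a hedged sketch that assumes boto3 and configured credentials. The function names are illustrative, and db_instance_arn is a placeholder for the ARN of your own DB instance.

```python
from typing import Dict, List


def build_tags() -> List[Dict[str, str]]:
    """Return the two tags that the state machine looks for."""
    return [
        {"Key": "autostart", "Value": "yes"},
        {"Key": "autostop", "Value": "yes"},
    ]


def tag_instance(db_instance_arn: str) -> None:
    """Add the autostart and autostop tags to one DB instance. This calls AWS."""
    import boto3  # imported lazily so build_tags works without boto3

    rds = boto3.client("rds")
    rds.add_tags_to_resource(ResourceName=db_instance_arn, Tags=build_tags())
```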

Create a state machine to start and stop the tagged DB instances

1.    Open the Step Functions console.

2.    In the navigation pane, choose State machines. Then, choose Create state machine.

3.    Choose Write your workflow in code.

4.    Keep the default Type as Standard.

5.    In the Definition editor, delete the sample JSON definition. Then, enter the following state machine definition:

{
  "Comment": "State Machine Definition to start and stop RDS DB instances",
  "StartAt": "Describe DBInstances to Start",
  "States": {
    "Describe DBInstances to Start": {
      "Type": "Task",
      "Parameters": {},
      "Resource": "arn:aws:states:::aws-sdk:rds:describeDBInstances",
      "Next": "Iterate on Instances to Start",
      "Retry": [
        {
          "ErrorEquals": [
            "Rds.InternalFailure",
            "Rds.ServiceUnavailable",
            "Rds.ThrottlingException",
            "Rds.SdkClientException"
          ],
          "BackoffRate": 2,
          "IntervalSeconds": 1,
          "MaxAttempts": 2
        }
      ]
    },
    "Iterate on Instances to Start": {
      "Type": "Map",
      "ItemProcessor": {
        "ProcessorConfig": {
          "Mode": "INLINE"
        },
        "StartAt": "Format Array before Start",
        "States": {
          "Format Array before Start": {
            "Type": "Pass",
            "Next": "Check If Instance stopped, if no Tags or if Tags contains 'autostart=yes'",
            "Parameters": {
              "DbInstanceStatus.$": "$.DBInstance.DbInstanceStatus",
              "DbInstanceIdentifier.$": "$.DBInstance.DbInstanceIdentifier",
              "TagList.$": "$.DBInstance.TagList",
              "TagsArrayLength.$": "States.ArrayLength($.DBInstance.TagList)",
              "TagContainsKey.$": "States.ArrayContains($.DBInstance.TagList,$.LookingFor)"
            }
          },
          "Check If Instance stopped, if no Tags or if Tags contains 'autostart=yes'": {
            "Type": "Choice",
            "Choices": [
              {
                "Not": {
                  "Variable": "$.DbInstanceStatus",
                  "StringEquals": "stopped"
                },
                "Next": "Instance is not in 'stopped' status"
              },
              {
                "Variable": "$.TagsArrayLength",
                "NumericEquals": 0,
                "Next": "No Tags found to Start"
              },
              {
                "Variable": "$.TagContainsKey",
                "BooleanEquals": true,
                "Next": "Tags found Start DBInstance"
              }
            ],
            "Default": "No Tags found to Start"
          },
          "Tags found Start DBInstance": {
            "Type": "Task",
            "Parameters": {
              "DbInstanceIdentifier.$": "$.DbInstanceIdentifier"
            },
            "Resource": "arn:aws:states:::aws-sdk:rds:startDBInstance",
            "Retry": [
              {
                "ErrorEquals": [
                  "Rds.InternalFailure",
                  "Rds.ServiceUnavailable",
                  "Rds.ThrottlingException",
                  "Rds.SdkClientException"
                ],
                "BackoffRate": 2,
                "IntervalSeconds": 1,
                "MaxAttempts": 2
              }
            ],
            "Catch": [
              {
                "ErrorEquals": [
                  "States.ALL"
                ],
                "ResultPath": "$.error",
                "Next": "Failed to Start DBInstance"
              }
            ],
            "ResultSelector": {
              "message": "Instance Started",
              "DbInstanceIdentifier.$": "$.DbInstance.DbInstanceIdentifier"
            },
            "End": true
          },
          "Failed to Start DBInstance": {
            "Type": "Pass",
            "Parameters": {
              "message": "Failed to start instance",
              "DbInstanceIdentifier.$": "$.DbInstanceIdentifier"
            },
            "End": true
          },
          "No Tags found to Start": {
            "Type": "Pass",
            "End": true,
            "Parameters": {
              "message": "No Tags found to Start",
              "DbInstanceIdentifier.$": "$.DbInstanceIdentifier"
            }
          },
          "Instance is not in 'stopped' status": {
            "Type": "Pass",
            "End": true,
            "Parameters": {
              "message": "Instance is not in 'stopped' status",
              "DbInstanceIdentifier.$": "$.DbInstanceIdentifier"
            }
          }
        }
      },
      "InputPath": "$.DbInstances",
      "Next": "Wait for 1 hour and 30 minutes",
      "ItemSelector": {
        "LookingFor": {
          "Key": "autostart",
          "Value": "yes"
        },
        "DBInstance.$": "$$.Map.Item.Value"
      }
    },
    "Wait for 1 hour and 30 minutes": {
      "Type": "Wait",
      "Seconds": 5400,
      "Next": "Describe DBInstances to Stop"
    },
    "Describe DBInstances to Stop": {
      "Type": "Task",
      "Parameters": {},
      "Resource": "arn:aws:states:::aws-sdk:rds:describeDBInstances",
      "Retry": [
        {
          "ErrorEquals": [
            "Rds.InternalFailure",
            "Rds.ServiceUnavailable",
            "Rds.ThrottlingException",
            "Rds.SdkClientException"
          ],
          "BackoffRate": 2,
          "IntervalSeconds": 1,
          "MaxAttempts": 2
        }
      ],
      "Next": "Iterate on Instances to Stop"
    },
    "Iterate on Instances to Stop": {
      "Type": "Map",
      "ItemProcessor": {
        "ProcessorConfig": {
          "Mode": "INLINE"
        },
        "StartAt": "Format Array before Stop",
        "States": {
          "Format Array before Stop": {
            "Type": "Pass",
            "Next": "Check If Instance available, if no Tags or if Tags contains 'autostop=yes'",
            "Parameters": {
              "DbInstanceStatus.$": "$.DBInstance.DbInstanceStatus",
              "DbInstanceIdentifier.$": "$.DBInstance.DbInstanceIdentifier",
              "TagList.$": "$.DBInstance.TagList",
              "TagsArrayLength.$": "States.ArrayLength($.DBInstance.TagList)",
              "TagContainsKey.$": "States.ArrayContains($.DBInstance.TagList,$.LookingFor)"
            }
          },
          "Check If Instance available, if no Tags or if Tags contains 'autostop=yes'": {
            "Type": "Choice",
            "Choices": [
              {
                "Not": {
                  "Variable": "$.DbInstanceStatus",
                  "StringEquals": "available"
                },
                "Next": "Instance is not in 'available' status"
              },
              {
                "Variable": "$.TagsArrayLength",
                "NumericEquals": 0,
                "Next": "No Tags found to Stop"
              },
              {
                "Variable": "$.TagContainsKey",
                "BooleanEquals": true,
                "Next": "Tags found Stop DBInstance"
              }
            ],
            "Default": "No Tags found to Stop"
          },
          "Tags found Stop DBInstance": {
            "Type": "Task",
            "Parameters": {
              "DbInstanceIdentifier.$": "$.DbInstanceIdentifier"
            },
            "Resource": "arn:aws:states:::aws-sdk:rds:stopDBInstance",
            "Retry": [
              {
                "ErrorEquals": [
                  "Rds.InternalFailure",
                  "Rds.ServiceUnavailable",
                  "Rds.ThrottlingException",
                  "Rds.SdkClientException"
                ],
                "BackoffRate": 2,
                "IntervalSeconds": 1,
                "MaxAttempts": 2
              }
            ],
            "Catch": [
              {
                "ErrorEquals": [
                  "States.ALL"
                ],
                "ResultPath": "$.error",
                "Next": "Failed to Stop DBInstance"
              }
            ],
            "ResultSelector": {
              "message": "Instance Stopped",
              "DbInstanceIdentifier.$": "$.DbInstance.DbInstanceIdentifier"
            },
            "End": true
          },
          "Failed to Stop DBInstance": {
            "Type": "Pass",
            "Parameters": {
              "message": "Failed to stop instance",
              "DbInstanceIdentifier.$": "$.DbInstanceIdentifier"
            },
            "End": true
          },
          "No Tags found to Stop": {
            "Type": "Pass",
            "End": true,
            "Parameters": {
              "message": "No Tags found to Stop",
              "DbInstanceIdentifier.$": "$.DbInstanceIdentifier"
            }
          },
          "Instance is not in 'available' status": {
            "Type": "Pass",
            "End": true,
            "Parameters": {
              "message": "Instance is not in 'available' status",
              "DbInstanceIdentifier.$": "$.DbInstanceIdentifier"
            }
          }
        }
      },
      "InputPath": "$.DbInstances",
      "Next": "Workflow Finished",
      "ItemSelector": {
        "LookingFor": {
          "Key": "autostop",
          "Value": "yes"
        },
        "DBInstance.$": "$$.Map.Item.Value"
      }
    },
    "Workflow Finished": {
      "Type": "Succeed"
    }
  }
}

Note: For testing purposes, you can reduce the Seconds value in the Wait for 1 hour and 30 minutes state. You can also increase or decrease this value to match a longer or shorter maintenance window.

6.    Choose Next.

7.    Enter a state machine name. For example, enter step-functions-start-stop-rds-state-machine.

8.    Under Permissions, choose Choose an existing role. Then, select the IAM role that you created. For example, select step-functions-start-stop-rds-role.

9.    (Optional) To add a tag, enter the appropriate values for the Key and Value fields.

10.  Choose Create state machine.
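
To make the branching in each Map iteration easier to follow, the following sketch mirrors the Choice state in plain Python. It is an illustration only and not part of the deployed workflow. The function name decide is hypothetical, and the sketch assumes that States.ArrayContains matches a tag only when both Key and Value are equal.

```python
from typing import Dict, List


def decide(instance: Dict, looking_for: Dict[str, str], required_status: str) -> str:
    """Return the branch that the Choice state would take for one DB instance."""
    tags: List[Dict[str, str]] = instance.get("TagList", [])
    # Rule 1: skip instances that aren't in the status the action requires.
    if instance["DbInstanceStatus"] != required_status:
        return f"Instance is not in '{required_status}' status"
    # Rule 2: an instance with no tags is never started or stopped.
    if len(tags) == 0:
        return "No Tags found"
    # Rule 3: act only when the exact Key/Value pair is present.
    if looking_for in tags:
        return "Tags found"
    # Default branch of the Choice state.
    return "No Tags found"


if __name__ == "__main__":
    example = {
        "DbInstanceIdentifier": "example-db",  # placeholder identifier
        "DbInstanceStatus": "stopped",
        "TagList": [{"Key": "autostart", "Value": "yes"}],
    }
    print(decide(example, {"Key": "autostart", "Value": "yes"}, "stopped"))
```

The start pass calls this logic with the autostart tag and the stopped status. The stop pass uses the autostop tag and the available status.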

Perform a workflow test

To test the workflow against tagged DB instances that are in the Stopped state, complete the following steps:

1.    Open the Step Functions console.

2.    In the navigation pane, choose State machines.

3.    Select the state machine that you created to start and stop your DB instances.

4.    Choose Start execution.
Note: This resolution doesn't require an event payload. In the Start execution dialog box, you can remove the event payload or leave the default event.

5.    Choose Start execution.
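
The same test run can be started from a script. The following is a minimal sketch that assumes boto3 and configured credentials. The function names are illustrative, and state_machine_arn is a placeholder for the ARN of the state machine that you created.

```python
from typing import Dict


def build_start_request(state_machine_arn: str) -> Dict[str, str]:
    """Build the StartExecution arguments. The workflow needs no payload."""
    return {
        "stateMachineArn": state_machine_arn,
        "input": "{}",  # empty JSON object, same as the console default
    }


def start_test_execution(state_machine_arn: str) -> str:
    """Start one execution and return its ARN. This calls AWS."""
    import boto3  # imported lazily so the builder above works without boto3

    sfn = boto3.client("stepfunctions")
    response = sfn.start_execution(**build_start_request(state_machine_arn))
    return response["executionArn"]
```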

Create the schedule

To start the tagged DB instances ahead of their weekly maintenance window, create an Amazon EventBridge rule. This rule automatically starts the DB instances 30 minutes before the maintenance window.

In the following example use case, the maintenance window runs from 22:00 to 22:30 on Sunday. The example rule starts the DB instances at 21:30 every Sunday.

1.    Open the EventBridge console.

2.    Select the state machine that you previously created in the section Create a state machine to start and stop the tagged DB instances.

3.    Under Buses, select Rules.

4.    Choose the default event bus from the dropdown list.

5.    Select Create rule.

6.    For Rule name, enter the name of the rule that you want to create. For example, enter step-functions-start-stop-rds-rule.

7.    For Rule type, choose Schedule. For other values on this page, keep their default settings.

8.    Choose Continue with EventBridge Scheduler.

9.    For Recurring schedule, choose Occurrence. Then, for Schedule type, choose Cron-based schedule.

10.    Add a cron expression for the automated schedule. For example, enter cron(30 21 ? * SUN *).

11.    For the Flexible time window, choose Off.

12.    Choose Next.

13.    Choose Frequently used APIs, and then choose StartExecution.

14.    Under StartExecution, select the state machine that you created. For example, select step-functions-start-stop-rds-state-machine. Keep the input's default value of {}.

15.    Choose Next.

16.    Under Retry policy and dead-letter queue (DLQ), clear the Retry policy to deactivate it.

17.    Under Permissions, keep the default option: Create new role for this schedule.

Note: EventBridge creates a new role that it assumes to call the StartExecution API for your workflow.

18.    Choose Next.

19.    Under Schedule detail, verify that the next 10 invocation dates match the dates of your expected schedule.

20.    Choose Create schedule.

When EventBridge invokes the rule, it begins the Step Functions workflow. The workflow starts the DB instances 30 minutes before the maintenance window, and then stops them 30 minutes after the maintenance window completes. For example, the workflow starts your DB instance at 21:30. The maintenance window runs from 22:00 to 22:30. Then, the workflow stops your instance at 23:00.
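
The cron expression and the Wait state duration both follow from the maintenance window. The following sketch shows that arithmetic, assuming a 30-minute lead before the window and a 30-minute buffer after it, as in the example. The function name plan_schedule and its parameters are illustrative, and the sketch ignores time zone handling.

```python
from datetime import datetime, timedelta
from typing import Tuple

DAYS = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]


def plan_schedule(
    window_day: str,
    window_start: str,
    window_minutes: int,
    lead_minutes: int = 30,
    trail_minutes: int = 30,
) -> Tuple[str, int, str]:
    """Return (cron expression, Wait seconds, stop time) for a weekly window."""
    hour, minute = (int(part) for part in window_start.split(":"))
    # Anchor on a known Monday so that subtracting the lead time can roll
    # back into the previous day and still produce the right weekday.
    monday = datetime(2024, 1, 1)
    start = monday + timedelta(days=DAYS.index(window_day), hours=hour, minutes=minute)
    wake = start - timedelta(minutes=lead_minutes)
    cron = f"cron({wake.minute} {wake.hour} ? * {DAYS[wake.weekday()]} *)"
    # The workflow waits through the lead time, the window, and the buffer.
    wait_seconds = (lead_minutes + window_minutes + trail_minutes) * 60
    stop = wake + timedelta(seconds=wait_seconds)
    return cron, wait_seconds, stop.strftime("%H:%M")


if __name__ == "__main__":
    print(plan_schedule("SUN", "22:00", 30))
```

For the Sunday 22:00 to 22:30 window, this yields the cron expression cron(30 21 ? * SUN *), a wait of 5400 seconds, and a stop time of 23:00, which matches the values used in this article.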