How do I set up cross-account streaming from Kinesis Data Firehose to Amazon OpenSearch Service?

I want to set up an Amazon Kinesis Data Firehose stream that sends data to an Amazon OpenSearch Service cluster in another account.

Short description

Set up Kinesis Data Firehose and its dependencies, such as Amazon Simple Storage Service (Amazon S3) and Amazon CloudWatch, to stream across different accounts. Streaming data delivery works for publicly accessible OpenSearch Service clusters, whether or not fine-grained access control (FGAC) is turned on.

To set up a Kinesis Data Firehose stream so that it sends data to an OpenSearch Service cluster, complete the following steps:

  1. Create an Amazon S3 bucket in Account A.
  2. Create a CloudWatch log group and log stream in Account A.
  3. Create a Kinesis Data Firehose role and policy in Account A.
  4. Create a publicly accessible OpenSearch Service cluster in Account B for the Kinesis Data Firehose role in Account A to stream data to.
  5. (Optional) If FGAC is turned on, then log in to OpenSearch Dashboards and add a role mapping.
  6. Update the AWS Identity and Access Management (IAM) role policy for your Kinesis Data Firehose role in Account A to send data to Account B.
  7. Create the Kinesis Data Firehose stream in Account A.
  8. Test cross-account streaming to the OpenSearch Service cluster.

Resolution

Create an Amazon S3 bucket in Account A

Create an S3 bucket in Account A. The Amazon S3 bucket generates an Amazon Resource Name (ARN).

Note: The complete ARN is used later to grant Kinesis Data Firehose access to save and retrieve records from the Amazon S3 bucket.
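As a minimal sketch of how that ARN is formed, the following Python helper builds the two resource strings that the Kinesis Data Firehose policy expects later: the bucket itself and every object in it. It assumes the standard aws partition. The function name and the bucket name my-firehose-backup are illustrative and aren't part of this article.

```python
def s3_policy_resources(bucket_name: str) -> list[str]:
    """Return the bucket ARN and the object-level ARN for an S3 bucket.

    Assumes the standard "aws" partition; other partitions (for example
    aws-cn or aws-us-gov) use a different ARN prefix.
    """
    if not bucket_name:
        raise ValueError("bucket_name must not be empty")
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    # The policy needs both: bucket-level actions use the first ARN,
    # object-level actions use the second.
    return [bucket_arn, f"{bucket_arn}/*"]


if __name__ == "__main__":
    # "my-firehose-backup" is a hypothetical bucket name.
    for arn in s3_policy_resources("my-firehose-backup"):
        print(arn)
```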

Create a CloudWatch log group and log stream in Account A

To create a CloudWatch log group, complete the following steps:

  1. Open the CloudWatch console.
  2. In the navigation pane, choose Logs, and then choose Log groups.
  3. Choose Create log group.
  4. Enter a Log Group name.
  5. Choose the Create log group button to save your new log group.
  6. Search for your newly created log group, and then select it.

To create an Amazon CloudWatch log stream, complete the following steps:

  1. Choose Create Log Stream.
  2. Enter a Log Stream Name.
  3. Choose Create Log Stream.
    Important: The CloudWatch log group and CloudWatch log stream names are required when you create Kinesis Data Firehose role policies.
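The following sketch shows one way to turn those two names into the resource ARN that the role policy references. The function name is hypothetical, and the Region, account ID, and log group name in the usage note are placeholders. The ARN layout follows the policy example in this article.

```python
def log_stream_resource(region: str, account_id: str, log_group: str,
                        log_stream: str = "*") -> str:
    """Build the CloudWatch Logs ARN used in a policy's Resource list.

    The default log_stream of "*" matches every stream in the log group,
    which mirrors the policy example in this article.
    """
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("account_id must be a 12-digit AWS account ID")
    return (f"arn:aws:logs:{region}:{account_id}:"
            f"log-group:{log_group}:log-stream:{log_stream}")
```

For example, log_stream_resource("us-east-1", "111122223333", "/aws/kinesisfirehose/my-stream") returns an ARN that covers all log streams in that group.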

Create a Kinesis Data Firehose role and a policy in Account A

  1. Open the AWS Identity and Access Management (IAM) console.
  2. Create an IAM policy that allows Kinesis Data Firehose to do the following:
    Save stream logs to CloudWatch
    Save records to Amazon S3
    Stream data to the OpenSearch Service cluster

Example:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads",
        "s3:PutObject",
        "s3:PutObjectAcl"
      ],
      "Resource": [
        "<Bucket ARN>",
        "<Bucket ARN>/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:PutLogEvents"
      ],
      "Resource": [
        "arn:aws:logs:<region>:<account-id>:log-group:/aws/kinesisfirehose/<Firehose Name>:log-stream:*"
      ]
    }
  ]
}

Note: You add the permissions to stream to the OpenSearch Service cluster to this policy in a later step. However, you must first create the cluster in Account B.

  3. Save the policy.
  4. Choose Create role.
  5. Add the policy to your Kinesis Data Firehose role.
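If you prefer to generate the example policy instead of editing JSON by hand, the following sketch assembles the same two statements from a bucket ARN and a CloudWatch Logs resource ARN. The function name and the sample ARNs are assumptions for illustration. The actions are copied from the example policy.

```python
import json

# Actions copied from the example policy in this section.
S3_ACTIONS = [
    "s3:AbortMultipartUpload",
    "s3:GetBucketLocation",
    "s3:GetObject",
    "s3:ListBucket",
    "s3:ListBucketMultipartUploads",
    "s3:PutObject",
    "s3:PutObjectAcl",
]


def build_base_policy(bucket_arn: str, log_resource_arn: str) -> dict:
    """Return the S3 and CloudWatch Logs statements as a policy document."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": S3_ACTIONS,
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            },
            {
                "Effect": "Allow",
                "Action": ["logs:PutLogEvents"],
                "Resource": [log_resource_arn],
            },
        ],
    }


if __name__ == "__main__":
    # Both ARNs below are hypothetical placeholders.
    policy = build_base_policy(
        "arn:aws:s3:::my-firehose-backup",
        "arn:aws:logs:us-east-1:111122223333:"
        "log-group:/aws/kinesisfirehose/my-stream:log-stream:*",
    )
    print(json.dumps(policy, indent=2))
```

You can paste the printed JSON into the policy editor in the IAM console.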

Create a publicly accessible OpenSearch Service cluster in Account B for the Kinesis Data Firehose role in Account A to stream data to

  1. Create your publicly accessible OpenSearch Service cluster in Account B.
  2. Record the OpenSearch Service domain ARN. You use the ARN in a later step.
  3. Configure your security settings for your cluster.
    Important: You must configure your OpenSearch Service security settings to allow the Kinesis Data Firehose role in Account A to stream to your OpenSearch Service cluster.

To configure your security settings, perform the following steps:

  1. In OpenSearch Service, navigate to Access policy.

  2. Select the JSON defined access policy. Your policy must have the following permissions:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "*"
          },
          "Action": "es:*",
          "Resource": "<ES Domain ARN in Account B>/*",
          "Condition": {
            "IpAddress": {
              "aws:SourceIp": "<Your IP Address for OpenSearch Dashboards access>"
            }
          }
        },
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "<Firehose Role ARN in Account A>"
          },
          "Action": [
            "es:ESHttpPost",
            "es:ESHttpPut"
          ],
          "Resource": [
            "<ES Domain ARN in Account B>",
            "<ES Domain ARN in Account B>/*"
          ]
        },
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "<Firehose Role ARN in Account A>"
          },
          "Action": "es:ESHttpGet",
          "Resource": [
            "<ES Domain ARN in Account B>/_all/_settings",
            "<ES Domain ARN in Account B>/_cluster/stats",
            "<ES Domain ARN in Account B>/index-name*/_mapping/type-name",
            "<ES Domain ARN in Account B>/roletest*/_mapping/roletest",
            "<ES Domain ARN in Account B>/_nodes",
            "<ES Domain ARN in Account B>/_nodes/stats",
            "<ES Domain ARN in Account B>/_nodes/*/stats",
            "<ES Domain ARN in Account B>/_stats",
            "<ES Domain ARN in Account B>/index-name*/_stats",
            "<ES Domain ARN in Account B>/roletest*/_stats"
          ]
        }
      ]
    }

    For more information about permissions within the OpenSearch Service policy, see Cross-account delivery to an OpenSearch Service destination.

  3. (Optional) If FGAC is turned on for your cluster, then log in to OpenSearch Dashboards and add a role mapping. The role mapping allows the Kinesis Data Firehose role to send requests to OpenSearch Service.

To log in to OpenSearch Dashboards and add a role mapping, complete the following steps:

  1. Open OpenSearch Dashboards.
  2. Choose the Security tab.
  3. Choose Roles.
  4. Choose the all_access role.
  5. Choose the Mapped users tab.
  6. Choose Manage mapping.
  7. In the Backend roles section, enter the ARN of the Kinesis Data Firehose role in Account A.
  8. Choose Map.
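Before you continue, it can help to confirm that the JSON access policy from the security settings steps names the Kinesis Data Firehose role and grants it both write actions. The following sketch is a simplified check and is not an IAM policy evaluator. It ignores Deny statements, conditions, and wildcards, and the function name and role ARN are hypothetical.

```python
def firehose_write_allowed(access_policy: dict, firehose_role_arn: str) -> bool:
    """Return True if Allow statements that name the role grant both
    es:ESHttpPost and es:ESHttpPut.

    Simplified string comparison only: Deny statements, conditions,
    wildcard principals, and wildcard actions such as "es:*" are ignored.
    """
    required = {"es:ESHttpPost", "es:ESHttpPut"}
    granted = set()
    for statement in access_policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal", {})
        principals = principal.get("AWS", []) if isinstance(principal, dict) else []
        if isinstance(principals, str):
            principals = [principals]
        if firehose_role_arn not in principals:
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        granted.update(actions)
    return required <= granted
```

Load the access policy with json.load, and then pass it to the function with the role ARN from Account A. A result of False means that you should review the Principal and Action entries.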

Update the IAM role policy for your Kinesis Data Firehose role in Account A to send data to Account B

To send data from your Kinesis Data Firehose role in Account A to your OpenSearch Service cluster in Account B, update the Kinesis Data Firehose policy.

Example:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads",
        "s3:PutObject",
        "s3:PutObjectAcl"
      ],
      "Resource": [
        "<Bucket ARN>",
        "<Bucket ARN>/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:PutLogEvents"
      ],
      "Resource": [
        "arn:aws:logs:<region>:<account-id>:log-group:/aws/kinesisfirehose/<Firehose Name>:log-stream:*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "es:ESHttpPost",
        "es:ESHttpPut",
        "es:DescribeDomain",
        "es:DescribeDomains",
        "es:DescribeDomainConfig"
      ],
      "Resource": [
        "<Domain ARN in Account B>",
        "<Domain ARN in Account B>/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "es:ESHttpGet"
      ],
      "Resource": [
        "<Domain ARN in Account B>/_all/_settings",
        "<Domain ARN in Account B>/_cluster/stats",
        "<Domain ARN in Account B>/index-name*/_mapping/type-name",
        "<Domain ARN in Account B>/_nodes",
        "<Domain ARN in Account B>/_nodes/stats",
        "<Domain ARN in Account B>/_nodes/*/stats",
        "<Domain ARN in Account B>/_stats",
        "<Domain ARN in Account B>/index-name*/_stats"
      ]
    }
  ]
}

For more information, see Grant Kinesis Data Firehose access to an Amazon OpenSearch Service destination.
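The following sketch shows one way to add the two OpenSearch Service statements to a policy document that you already have as a Python dictionary. The function and parameter names are hypothetical. The actions and resource paths mirror the example policy, with the index prefix and type name exposed as parameters.

```python
import copy


def add_opensearch_statements(policy: dict, domain_arn: str,
                              index_prefix: str = "index-name",
                              type_name: str = "type-name") -> dict:
    """Return a copy of the policy with the OpenSearch statements appended."""
    read_paths = [
        "_all/_settings",
        "_cluster/stats",
        f"{index_prefix}*/_mapping/{type_name}",
        "_nodes",
        "_nodes/stats",
        "_nodes/*/stats",
        "_stats",
        f"{index_prefix}*/_stats",
    ]
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    updated["Statement"].extend([
        {
            "Effect": "Allow",
            "Action": [
                "es:ESHttpPost",
                "es:ESHttpPut",
                "es:DescribeDomain",
                "es:DescribeDomains",
                "es:DescribeDomainConfig",
            ],
            "Resource": [domain_arn, f"{domain_arn}/*"],
        },
        {
            "Effect": "Allow",
            "Action": ["es:ESHttpGet"],
            "Resource": [f"{domain_arn}/{path}" for path in read_paths],
        },
    ])
    return updated
```

Because the function works on a copy, you can compare the original and updated documents before you save the change in the IAM console.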

Create the Kinesis Data Firehose stream in Account A

To create a Kinesis Data Firehose stream with cross-account access to an OpenSearch Service cluster, use the AWS Command Line Interface (AWS CLI).

Check to make sure that your AWS CLI is up-to-date:

aws --version

Note: If you receive errors when you run AWS CLI commands, then see Troubleshoot AWS CLI errors. Also, make sure that you're using the most recent AWS CLI version.

Create a file called input.json with the following content:

{
  "DeliveryStreamName": "<Firehose Name>",
  "DeliveryStreamType": "DirectPut",
  "ElasticsearchDestinationConfiguration": {
    "RoleARN": "<Firehose Role ARN in Account A>",
    "ClusterEndpoint": "<OpenSearch Service domain endpoint in Account B>",
    "IndexName": "local",
    "TypeName": "TypeName",
    "IndexRotationPeriod": "OneDay",
    "BufferingHints": {
      "IntervalInSeconds": 60,
      "SizeInMBs": 50
    },
    "RetryOptions": {
      "DurationInSeconds": 60
    },
    "S3BackupMode": "FailedDocumentsOnly",
    "S3Configuration": {
      "RoleARN": "<Firehose Role ARN in Account A>",
      "BucketARN": "<Bucket ARN>",
      "Prefix": "",
      "BufferingHints": {
        "SizeInMBs": 128,
        "IntervalInSeconds": 128
      },
      "CompressionFormat": "UNCOMPRESSED",
      "CloudWatchLoggingOptions": {
        "Enabled": true,
        "LogGroupName": "/aws/kinesisfirehose/<Firehose Name>",
        "LogStreamName": "S3Delivery"
      }
    },
    "CloudWatchLoggingOptions": {
      "Enabled": true,
      "LogGroupName": "/aws/kinesisfirehose/<Firehose Name>",
      "LogStreamName": "ElasticsearchDelivery"
    }
  }
}

Make sure that the endpoint value is correctly entered in the ClusterEndpoint attribute field.

Note: Types are deprecated in Elasticsearch version 7.x. For Elasticsearch versions 7.x, remove the TypeName attribute from the input.json file.
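To catch unset values before you run the AWS CLI command, you can pass the file contents through a small check such as the following sketch. It flags required fields that are empty or still hold a <placeholder>, and it removes TypeName for Elasticsearch 7.x or later. The function names are hypothetical, and the https:// check reflects an assumption that the endpoint must be an HTTPS URL.

```python
import json


def _unset(value) -> bool:
    """True for empty values and for leftover <placeholder> strings."""
    return not value or str(value).startswith("<")


def prepare_input(config: dict, elasticsearch_major_version: int):
    """Return (cleaned copy of config, list of problems found)."""
    cleaned = json.loads(json.dumps(config))  # deep copy via JSON round trip
    dest = cleaned.get("ElasticsearchDestinationConfiguration", {})
    problems = []

    for key in ("RoleARN", "ClusterEndpoint"):
        if _unset(dest.get(key)):
            problems.append(f"{key} is not set")

    endpoint = dest.get("ClusterEndpoint", "")
    # Assumption: the endpoint must be an HTTPS URL.
    if not _unset(endpoint) and not endpoint.startswith("https://"):
        problems.append("ClusterEndpoint should start with https://")

    s3_config = dest.get("S3Configuration", {})
    for key in ("RoleARN", "BucketARN"):
        if _unset(s3_config.get(key)):
            problems.append(f"S3Configuration.{key} is not set")

    # Types are deprecated in Elasticsearch 7.x, so drop the attribute.
    if elasticsearch_major_version >= 7:
        dest.pop("TypeName", None)

    return cleaned, problems
```

If the returned list of problems is empty, write the cleaned dictionary back to input.json with json.dump.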

Then, run the following command in the same directory as the location of the input.json file:

aws firehose create-delivery-stream --cli-input-json file://input.json

This command creates a Kinesis Data Firehose stream in Account A that delivers data to an OpenSearch Service cluster in Account B.
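A new stream isn't usable until its status is ACTIVE. As a hedged sketch, the following Python code wraps the describe-delivery-stream command and reads the status from its JSON output. It assumes that the AWS CLI is installed and configured with credentials for Account A. Both function names are hypothetical.

```python
import json
import subprocess


def stream_status(describe_output: dict) -> str:
    """Extract the status from describe-delivery-stream JSON output."""
    return describe_output["DeliveryStreamDescription"]["DeliveryStreamStatus"]


def fetch_stream_status(stream_name: str) -> str:
    """Run the AWS CLI and return the delivery stream status.

    Requires the AWS CLI on PATH with credentials for Account A.
    """
    result = subprocess.run(
        ["aws", "firehose", "describe-delivery-stream",
         "--delivery-stream-name", stream_name,
         "--output", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return stream_status(json.loads(result.stdout))
```

Call fetch_stream_status with the name from input.json, and wait until it returns ACTIVE before you send test records.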

Test cross-account streaming to the OpenSearch Service cluster

Use the Kinesis Data Generator (KDG) to stream records into the Kinesis Data Firehose stream in Account A.

The KDG generates many records per second. This volume gives OpenSearch Service enough data points to determine the correct mapping of the record structure.

The following is the template structure used in the Kinesis Data Generator:

{
    "device_id": {{random.number(5)}},
    "device_owner": "{{name.firstName}}  {{name.lastName}}",
    "temperature": {{random.number(
        {
            "min":10,
            "max":150
        }
    )}},
    "timestamp": "{{date.now("DD/MMM/YYYY:HH:mm:ss Z")}}"
}
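If you want to preview the records that this template produces, the following sketch generates records with the same four fields in Python. It's an approximation and not the KDG itself. The name lists are made up, random.number(5) is assumed to mean an integer from 0 to 5, and the timestamp imitates the template's date pattern with a colon in the UTC offset.

```python
import json
import random
from datetime import datetime, timezone

# Hypothetical sample names; the KDG draws from a much larger set.
FIRST_NAMES = ["Ana", "Li", "Sam"]
LAST_NAMES = ["Garcia", "Chen", "Patel"]


def make_record(rng: random.Random, now: datetime = None) -> dict:
    """Return one record shaped like the KDG template output."""
    now = now or datetime.now(timezone.utc)
    if now.tzinfo is None:
        raise ValueError("now must be a timezone-aware datetime")
    offset = now.strftime("%z")  # for example "+0000"
    # Imitate "DD/MMM/YYYY:HH:mm:ss Z", where Z is an offset like +00:00.
    stamp = now.strftime("%d/%b/%Y:%H:%M:%S ") + offset[:3] + ":" + offset[3:5]
    return {
        "device_id": rng.randint(0, 5),
        "device_owner": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
        "temperature": rng.randint(10, 150),
        "timestamp": stamp,
    }


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(3):
        print(json.dumps(make_record(rng)))
```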

To verify that cross-account streaming was successful, review the index entries under the Indices tab of your cluster. Check if there is an index name that uses the prefix "local" with the current date. You can also check if the records are present in OpenSearch Dashboards.

Note: OpenSearch Service takes a few minutes to determine the correct mapping.
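To know which index name to look for, you can compute it from the IndexName and the date. The following sketch assumes that the OneDay rotation period appends the UTC date in YYYY-MM-DD format to the index name. The function name is hypothetical.

```python
from datetime import date, datetime, timezone


def expected_daily_index(index_name: str, day: date) -> str:
    """Return the index name expected for a OneDay rotation period.

    Assumes the rotation suffix is the UTC date as YYYY-MM-DD.
    """
    return f"{index_name}-{day:%Y-%m-%d}"


if __name__ == "__main__":
    today_utc = datetime.now(timezone.utc).date()
    print(expected_daily_index("local", today_utc))
```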

Related information

Creating an Amazon Kinesis Data Firehose delivery stream

Writing to Kinesis Data Firehose using Kinesis Data Streams

AWS OFFICIAL
Updated 4 months ago