How to detect existing use of SSE-C in your Amazon S3 buckets

7 minute read
Content level: Intermediate

In this article we detail how to detect objects encrypted with SSE-C within an S3 bucket, and provide guidance on scaling this approach.

Administrators may wish to confirm they do not have any current, legitimate use of SSE-C within their Amazon S3 buckets. Once use of SSE-C is blocked, further requests to write objects using SSE-C would be denied.

What is SSE-C?

Server-side encryption (SSE) is about protecting data at rest. Using server-side encryption with customer-provided keys (SSE-C), you can store your data encrypted with your own encryption key. S3 encrypts the object with your key and then discards the key. When you retrieve an object, you must provide the same encryption key as part of your request. This differs from using Amazon S3 managed keys (SSE-S3) or AWS Key Management Service (AWS KMS) keys (SSE-KMS), where you do not supply key material with each request.
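To make the "provide the same encryption key as part of your request" requirement concrete, the following minimal Python sketch builds the three request headers that accompany an SSE-C request. The function name sse_c_headers and the randomly generated key are illustrative assumptions, not part of the original article.

```python
import base64
import hashlib
import os


def sse_c_headers(key: bytes) -> dict[str, str]:
    """Build the SSE-C request headers for a 256-bit customer-provided key."""
    if len(key) != 32:
        raise ValueError("SSE-C requires a 256-bit (32-byte) key")
    return {
        # AES256 is the algorithm value used with SSE-C.
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        # The key itself, base64-encoded.
        "x-amz-server-side-encryption-customer-key": base64.b64encode(key).decode("ascii"),
        # Base64-encoded MD5 digest of the key, sent as an integrity check.
        "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(
            hashlib.md5(key).digest()
        ).decode("ascii"),
    }


if __name__ == "__main__":
    example_key = os.urandom(32)  # hypothetical key for demonstration only
    for name, value in sse_c_headers(example_key).items():
        print(name, "=", value)
```

Because S3 discards the key after use, the same headers must accompany every later read of the object.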

Detecting existing use

AWS CloudTrail data events can be queried (including at scale, with CloudTrail Lake), and audited in real time, for active use of SSE-C. This article focusses on checking for existing use, i.e. whether any of your S3 objects are currently encrypted with SSE-C. Amazon S3 Inventory provides a list of your objects and metadata, on a schedule that you define, and can be queried with Amazon Athena.

Configuring S3 Inventory

Note: It might take up to 48 hours for Amazon S3 to deliver the first inventory report.

To set up inventory reports, follow the steps in Configuring inventory by using the S3 console. The following values are recommended:

  • Inventory configuration name: The name will determine the path to the inventory data. Choose a standard format for this, such as <bucket name>-inventory.
  • Inventory scope:
    • Prefix: Leave this blank to include all objects.
    • Object Versions: Include all versions
  • Report details:
    • Destination bucket: Another S3 bucket in the same AWS Region.
      • It is possible to use the same S3 bucket, though this isn’t recommended as it can create a circular relationship.
    • Frequency: Weekly or Daily, depending on your preference. Your selection will not affect how quickly the first inventory report is delivered.
    • Output format: Apache Parquet.
      • Other formats are available, and will require adjustment to the Athena table configuration below.
    • Status: Enable
  • Inventory report encryption:
    • Server side encryption: Don’t specify an encryption key (i.e. use SSE-S3).
      • Use of an SSE-KMS key of your choice is available, and beyond the scope of this article.
  • Additional metadata fields: Choose all fields. There is no additional charge for them (other than bytes stored), the extra metadata may be helpful, and all of them are included in the Athena table configuration below. At a minimum, Encryption is required.

To do this programmatically, update and use this AWS Command Line Interface (CLI) example:

aws s3api put-bucket-inventory-configuration \
    --bucket bucketname \
    --id bucketname-inventory \
    --inventory-configuration '{
        "Destination": {
            "S3BucketDestination": {
                "AccountId": "DestinationBucketAccountId",
                "Bucket": "arn:aws:s3:::bucketname",
                "Format": "Parquet"
            }
        },
        "IsEnabled": true,
        "Id": "bucketname-inventory",
        "IncludedObjectVersions": "All",
        "Schedule": {
            "Frequency": "Daily"
        },
        "OptionalFields": [
            "Size",
            "LastModifiedDate",
            "StorageClass",
            "ETag",
            "IsMultipartUploaded",
            "ReplicationStatus",
            "EncryptionStatus",
            "ObjectLockRetainUntilDate",
            "ObjectLockMode",
            "ObjectLockLegalHoldStatus",
            "IntelligentTieringAccessTier"
        ]
    }'

Note: If you create your inventory configuration through the Amazon S3 console, Amazon S3 automatically creates a bucket policy on the destination bucket that grants Amazon S3 write permission to the bucket. However, if you create your inventory configuration through the AWS CLI, AWS SDKs, or the Amazon S3 REST API, you must manually add a bucket policy on the destination bucket. An example is documented here.
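As a rough illustration of the bucket policy mentioned in the note above, the following Python sketch generates a policy document that lets Amazon S3 write inventory reports to the destination bucket. The function name and all bucket names and account IDs are hypothetical placeholders, and the statement follows the general shape of the documented example; check it against the linked documentation before use.

```python
import json


def inventory_destination_policy(
    source_bucket: str, destination_bucket: str, source_account_id: str
) -> str:
    """Return a JSON bucket policy allowing S3 Inventory delivery."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InventoryDestinationWrite",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": {"Service": "s3.amazonaws.com"},
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{destination_bucket}/*",
                "Condition": {
                    # Only accept deliveries originating from the source bucket.
                    "ArnLike": {"aws:SourceArn": f"arn:aws:s3:::{source_bucket}"},
                    "StringEquals": {
                        "aws:SourceAccount": source_account_id,
                        "s3:x-amz-acl": "bucket-owner-full-control",
                    },
                },
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # Placeholder values; replace with your own buckets and account ID.
    print(inventory_destination_policy("bucketname", "inventory-bucket", "111122223333"))
```

The output can be saved to a file and applied with aws s3api put-bucket-policy. Note that applying a bucket policy replaces any existing policy on the destination bucket, so merge this statement into the current policy if one exists.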

Setting up Amazon Athena

If this is the first time you are using Athena in this account and Region, use this guide to get started with Athena. Follow the guide up to and including step 7.

Creating a database and table

  1. In the Athena query editor, run the following command to create a database for Inventory Reports:
CREATE DATABASE s3_inventory_reports
  2. From the Database list on the left, choose s3_inventory_reports to make it your current database.
  3. Edit the query example below, as follows:
    1. Replace your_table_name with the name of the bucket you are creating the table for, or your preferred table name.
    2. Replace inventory-bucket with the destination bucket you configured for your inventory reports, followed by any optional prefix and the source bucket name. S3 Inventory delivers reports to <destination bucket>/<optional prefix>/<source bucket name>/<configuration name>/.
    3. Replace config-ID with the configuration name of your inventory (as specified above in Inventory configuration name).
CREATE EXTERNAL TABLE s3_inventory_reports.your_table_name(
         bucket string,
         key string,
         version_id string,
         is_latest boolean,
         is_delete_marker boolean,
         size bigint,
         last_modified_date timestamp,
         e_tag string,
         storage_class string,
         is_multipart_uploaded boolean,
         replication_status string,
         encryption_status string,
         object_lock_retain_until_date timestamp,
         object_lock_mode string,
         object_lock_legal_hold_status string,
         intelligent_tiering_access_tier string,
         bucket_key_status string,
         checksum_algorithm string,
         object_access_control_list string,
         object_owner string
) PARTITIONED BY (
        dt string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
  STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
  LOCATION 's3://inventory-bucket/config-ID/hive/'
  TBLPROPERTIES (
    "projection.enabled" = "true",
    "projection.dt.type" = "date",
    "projection.dt.format" = "yyyy-MM-dd-HH-mm",
    "projection.dt.range" = "2022-01-01-00-00,NOW",
    "projection.dt.interval" = "1",
    "projection.dt.interval.unit" = "HOURS"
  );

Querying with Athena

With Athena set up and the table created, you can now run a query to determine how many objects are encrypted using SSE-C. First, if you have multiple inventories for a bucket, you need the datetime of the most recent inventory. You can find this in the AWS console for your S3 bucket, under Management > Inventory configurations, in the Last export column.

S3 Inventory configurations

Or you can run the following query in Athena:

SELECT MAX(dt) FROM <inventory bucket table name>

The following query counts the objects in the most recent inventory whose encryption status is SSE-C. If it returns 0, then there are no objects using SSE-C encryption.

SELECT 
    count(encryption_status)
FROM 
    <inventory bucket table name>
WHERE 
    encryption_status = 'SSE-C'
    AND dt = '<most recent inventory report date>'

If the previous query returned a count > 0, run the following query to output all the matching object key names and selected metadata:

SELECT 
    bucket,
    key,
    version_id,
    size,
    last_modified_date,
    encryption_status
FROM 
    <inventory bucket table name>
WHERE 
    encryption_status = 'SSE-C'
    AND dt = '<most recent inventory report date>'
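The two queries above can also be run from a script. The following Python sketch is one possible approach, assuming the third-party boto3 library is installed and AWS credentials are configured. The function names, the table name, and the query results location are hypothetical; only the SQL mirrors the queries in this article.

```python
import re
import time

DETAIL_COLUMNS = "bucket, key, version_id, size, last_modified_date, encryption_status"


def build_sse_c_query(table: str, dt: str, count_only: bool = True) -> str:
    """Build the SSE-C count query, or the detail query when count_only is False."""
    # Validate inputs because they are interpolated directly into the SQL text.
    if not re.fullmatch(r"[A-Za-z0-9_]+", table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}-\d{2}-\d{2}", dt):
        raise ValueError(f"dt must look like yyyy-MM-dd-HH-mm, got {dt!r}")
    columns = "count(*) AS sse_c_objects" if count_only else DETAIL_COLUMNS
    return (
        f"SELECT {columns} FROM {table} "
        f"WHERE encryption_status = 'SSE-C' AND dt = '{dt}'"
    )


def run_query(sql: str, database: str, output_location: str, poll_seconds: float = 2.0):
    """Run a query in Athena and return the first page of result rows."""
    import boto3  # third-party; imported here so the query builder works without it

    athena = boto3.client("athena")
    query_id = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)
    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {state}")
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    # The first row holds the column headers; later pages are not fetched here.
    return [[col.get("VarCharValue") for col in row["Data"]] for row in rows[1:]]


if __name__ == "__main__":
    sql = build_sse_c_query("your_table_name", "2024-01-01-01-00")
    # Hypothetical results location; replace with your Athena query result bucket.
    print(run_query(sql, "s3_inventory_reports", "s3://athena-results-bucket/sse-c/"))
```

Keeping the query builder separate from the Athena call makes it possible to review the generated SQL before anything is executed.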

Operating at scale

The blog post Consolidate and query Amazon S3 Inventory reports for Region-wide object-level visibility provides a solution for consolidating S3 inventories from multiple buckets across different accounts into a central location per AWS Region, so that they can be queried at scale.

After deploying the solution, the following query can be used to return SSE-C encrypted objects across all accounts and buckets in an AWS Region:

SELECT 
    bucket,
    key,
    version_id,
    size,
    last_modified_date,
    encryption_status
FROM 
    "default"."inventory"
WHERE 
    encryption_status = 'SSE-C'

Cleaning up

To prevent ongoing costs, you may wish to disable or delete S3 Inventory configurations and delete stored inventories. We recommend using an S3 Lifecycle expiration rule to delete older inventory data.
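As one way to set up the recommended Lifecycle rule, the following Python sketch builds a configuration that expires inventory data under a given prefix after a number of days. It assumes the third-party boto3 library for the optional apply step; the function names, bucket name, prefix, and retention period are hypothetical.

```python
def inventory_expiration_rule(prefix: str, days: int) -> dict:
    """Build a Lifecycle configuration that expires objects under prefix."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": f"expire-inventory-after-{days}-days",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_rule(bucket: str, configuration: dict) -> None:
    """Apply the configuration; this REPLACES any existing Lifecycle rules."""
    import boto3  # third-party; only needed when actually applying the rule

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=configuration
    )


if __name__ == "__main__":
    # Placeholder values; replace with your inventory bucket and report prefix.
    config = inventory_expiration_rule("bucketname/bucketname-inventory/", 30)
    apply_rule("inventory-bucket", config)
```

If the destination bucket already has Lifecycle rules, add this rule to the existing configuration rather than applying it on its own, because the call overwrites the whole configuration.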

To delete the Athena table and database, run the following queries in order:

DROP TABLE <inventory bucket table name>

DROP DATABASE s3_inventory_reports

Conclusion

Before blocking SSE-C, administrators may wish to check if there is any legitimate use of SSE-C within their Amazon S3 buckets, as blocking it would deny further requests to write objects using SSE-C. This article explained how to detect objects encrypted with SSE-C, and provided guidance on scaling this approach.