glue catalog cross account

0

Hi, cross-account access to the Glue catalog requires customer-managed keys. When we create a new KMS key, change the KMS key in the Glue catalog settings, and run crawlers, the existing tables and data are not re-encrypted, so we cannot query them from another account. I guess I need to re-encrypt the data with the new KMS key. Is there any way to avoid this re-encryption? If not, what is the best way to re-encrypt existing data? [https://docs.aws.amazon.com/athena/latest/ug/security-iam-cross-account-glue-catalog-access.html].

4 Answers
1
Accepted Answer

You're correct: if your data is encrypted with an AWS-managed KMS key, there's no way you can permit access to it from another AWS account. You can only do that with a customer-managed KMS key, because you can control the key policy and optionally create grants for cross-account access.

How to re-encrypt your data depends on where and how it's stored. Is the encrypted data stored in S3 or in some other service? For S3, the procedure could be as simple as setting the default encryption of the S3 bucket to SSE-KMS with the customer-managed KMS key, and then copying every object in place, that is, to the same bucket and the same object key. The new copies would get encrypted with the new key by default.
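For a single object, the in-place copy could look roughly like the sketch below. It assumes Python with a boto3 S3 client created by the caller (for example `boto3.client("s3")`); the function name `reencrypt_in_place` and the argument values are hypothetical, while `copy_object` and its `ServerSideEncryption` and `SSEKMSKeyId` parameters are the regular boto3 ones. It only applies to objects up to 5 GB.

```python
def reencrypt_in_place(s3_client, bucket, key, kms_key_id):
    """Copy an object onto itself so S3 rewrites it under the given KMS key.

    s3_client is expected to be a boto3 S3 client supplied by the caller.
    """
    return s3_client.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key},
        # Changing the encryption attributes is what makes a self-copy valid.
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId=kms_key_id,
        # Keep the existing user metadata and content type.
        MetadataDirective="COPY",
    )
```

On a versioned bucket, the copy becomes a new object version and the previous version stays encrypted as it was.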

You could consider using S3 Batch Operations for creating the copies: https://docs.aws.amazon.com/AmazonS3/latest/userguide/batch-ops-copy-object.html

Note that the CopyObject API in S3 has a maximum object size of 5 GB, so the simple "in-place copy" only works if none of your objects are larger. Larger objects can still be copied inside S3, but not with the S3 Batch Operations copy; you would need custom code that uses the UploadPartCopy API with a multipart upload to copy each object in parts no larger than 5 GB.
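To illustrate the "copy in parts" idea, here is a small sketch that only computes the byte ranges for such a multipart copy. The helper name `copy_part_ranges` and the 1 GiB default part size are my own assumptions, and the 5 GB per-part limit is taken here as 5 GiB. Each range string is in the format you would pass as `CopySourceRange` to boto3's `upload_part_copy`, between `create_multipart_upload` and `complete_multipart_upload`.

```python
GIB = 1024 ** 3
MAX_PART_SIZE = 5 * GIB  # per-part upper bound, assumed to be 5 GiB


def copy_part_ranges(object_size, part_size=1 * GIB):
    """Return (part_number, range_header) pairs covering object_size bytes."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    if not 0 < part_size <= MAX_PART_SIZE:
        raise ValueError("part_size must be between 1 byte and 5 GiB")
    ranges = []
    start = 0
    part_number = 1
    while start < object_size:
        # Range ends are inclusive, so the last byte of a part is end.
        end = min(start + part_size, object_size) - 1
        ranges.append((part_number, f"bytes={start}-{end}"))
        start = end + 1
        part_number += 1
    return ranges
```

When creating the multipart upload for the target, you would specify the new KMS key (or rely on the bucket's default encryption) so that the assembled object ends up encrypted with it.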

EXPERT
answered 2 years ago
EXPERT
reviewed 2 years ago
  • If we change the encryption type of the source S3 bucket to SSE-KMS, does that affect anything else? For example, will the data still be queryable by Athena in the trusted and trusting accounts?

  • I just realized that my current data was not encrypted with an AWS-managed KMS key. The bucket's encryption type is SSE-S3. So for cross-account Glue catalog access, can we keep SSE-S3 for the bucket where we store the data?

  • @gh-v Yes, you can keep your data encrypted with SSE-S3. You won't get the benefit of separating access to encryption keys from access to the encrypted data, but technically, you can have your Glue Catalog use KMS for metadata encryption while using SSE-S3 to encrypt the data in S3.

  • Thanks. I am using SSE-S3 and have applied all the settings needed for cross-account Glue catalog access. The weird thing is that when I create a new crawler, the new table is queryable in Account B (the other account), but existing tables are not. I re-ran the crawler for all tables, but it still does not work for the tables that already existed. Do I need to recreate the tables? I am trying to find a way to avoid this.

1

Without using bucket keys, S3 will execute one kms:Decrypt call for each CopyObject API call for the source object and one kms:GenerateDataKey call per CopyObject call for the target object. With the same assumption as before that all the objects are no larger than 5 GB each and that each object will be copied with a single CopyObject API call, you can calculate your KMS cost by multiplying the cost for a single API call by the number of objects you're re-encrypting and multiplying the result by the 2 KMS calls per object.
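As a rough sketch of that calculation in Python: the function name is hypothetical, and the price of $0.03 per 10,000 KMS requests is the figure quoted elsewhere in this thread, so check the current price list for your region before relying on it.

```python
KMS_PRICE_PER_10K_REQUESTS = 0.03  # USD, figure quoted in this thread


def kms_reencryption_cost(num_objects, calls_per_object=2,
                          price_per_10k=KMS_PRICE_PER_10K_REQUESTS):
    """Return (total KMS calls, estimated cost in USD) without bucket keys."""
    # One kms:Decrypt for the source plus one kms:GenerateDataKey for the target.
    total_calls = num_objects * calls_per_object
    return total_calls, total_calls / 10_000 * price_per_10k


calls, cost = kms_reencryption_cost(10_000)
print(f"{calls} KMS calls, about ${cost:.2f}")  # 20000 KMS calls, about $0.06
```

If the source objects are not SSE-KMS encrypted (for example SSE-S3), there is no kms:Decrypt on the source side, so `calls_per_object=1` would be the closer estimate.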

EXPERT
answered 2 years ago
  • So without bucket keys it is still cheap. For example, if you have 10,000 objects, the cost of re-encryption is: 10,000 * 2 = 20,000 calls, and (20,000 / 10,000) * $0.03 = $0.06.

    What about additional costs, like the S3 CopyObject operations? Should we consider anything else?

  • Yes, for the re-encryption operation alone, there's limited benefit from bucket keys. However, when the objects are accessed later, long after the one-time re-encryption, bucket keys will reduce the KMS cost of accessing them. As for other costs, you'll indeed pay for the S3 API calls. If your objects are in the Standard storage class and you run the copy via Batch Operations, possibly with an S3 Inventory report as the source, you'll also be charged for S3 Inventory and S3 Batch Operations. If you have only 10,000 objects, all those costs will be quite small, but do review the price list.

1

All the principals (IAM roles, IAM users) that are used to access the data will have to be granted the necessary permissions to the KMS key. This is exactly how KMS encryption facilitates security: it decouples access controls to the stored, encrypted data from the access controls to the key. Only having access to one or the other won't suffice, making it that much less likely for access to be granted inadvertently or to be obtained maliciously.

In the cross-account scenario, you will have to permit access to the key in the identity-based policies (policies attached to the IAM roles/users) in the account where the IAM roles/users reside. Additionally, in the AWS account where the key resides, you will have to permit access for the same roles/users in the KMS key policy. With only one or the other side of this cross-account "trust hug", access would be implicitly denied.

For S3 SSE-KMS, the needed permissions are kms:GenerateDataKey and kms:Decrypt (the latter needed both for reading and for multipart uploads).
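To make the two sides concrete, here is a sketch in Python that builds both policy documents as plain dictionaries. The role ARN, key ARN, account IDs, and `Sid` are hypothetical placeholders, and you may want to narrow the statements further (for example with conditions) for your own setup.

```python
import json

# Permissions named above for S3 SSE-KMS.
SSE_KMS_ACTIONS = ["kms:GenerateDataKey", "kms:Decrypt"]


def key_policy_statement(principal_arn):
    """Statement to add to the KMS key policy in the account that owns the key."""
    return {
        "Sid": "AllowCrossAccountUseOfKey",  # hypothetical identifier
        "Effect": "Allow",
        "Principal": {"AWS": principal_arn},
        "Action": SSE_KMS_ACTIONS,
        "Resource": "*",  # within a key policy, "*" refers to this key
    }


def identity_policy(key_arn):
    """Identity-based policy to attach to the role/user in the other account."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": SSE_KMS_ACTIONS,
            "Resource": key_arn,
        }],
    }


# Hypothetical ARNs, for illustration only.
role_arn = "arn:aws:iam::111122223333:role/AthenaQueryRole"
key_arn = "arn:aws:kms:us-east-1:444455556666:key/EXAMPLE-KEY-ID"
print(json.dumps(key_policy_statement(role_arn), indent=2))
print(json.dumps(identity_policy(key_arn), indent=2))
```

Both documents are needed: the key policy statement goes on the key in the key-owning account, and the identity policy goes on the role or user in the account that reads the data.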

EXPERT
answered 2 years ago
  • Thanks. Do you have any advice on estimating the cost of this re-encryption? I know the key-usage cost is $0.03 per 10,000 requests, but I don't know how to estimate the encryption cost. For example, suppose we have 50 GB of data in this bucket in total (10,000 files).

1

You can reduce KMS costs for SSE-KMS by using Bucket Keys: https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucket-key.html

Classical SSE-KMS would ask KMS to generate a unique data key for each object and to encrypt it with the KMS key. The cost for KMS calls would scale linearly with the number of objects written.

With bucket keys, S3 uses KMS to generate and encrypt bucket keys instead. Multiple unique data keys for individual objects are cryptographically derived from a single bucket key, without involving KMS. Each bucket key is only used for a limited amount of time, so the same cryptographic material (of the bucket key) is only used for deriving data keys (for individual objects) for the limited number of objects written within that time window.

For decryption, when an object encrypted with a bucket key is accessed, the first operation requires a call to KMS to decrypt the bucket key. However, S3 will retain the decrypted bucket key for a limited amount of time, allowing other objects using the same bucket key to be decrypted without invoking KMS during that time.
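If you want to turn bucket keys on together with the new default key, a sketch along these lines should do it. It assumes Python with a boto3 S3 client passed in by the caller; the function names are hypothetical, while `put_bucket_encryption` and the `BucketKeyEnabled` rule field are the regular S3 API ones.

```python
def bucket_key_encryption_config(kms_key_id):
    """Default-encryption configuration: SSE-KMS with S3 Bucket Keys enabled."""
    return {
        "Rules": [{
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": kms_key_id,
            },
            "BucketKeyEnabled": True,
        }]
    }


def enable_bucket_key(s3_client, bucket, kms_key_id):
    """Apply the configuration; s3_client is expected to be a boto3 S3 client."""
    return s3_client.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=bucket_key_encryption_config(kms_key_id),
    )
```

The setting only affects objects written afterwards; existing objects keep their current encryption until they are copied or rewritten.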

In practice, if you re-encrypt a large number of objects within a short period of time with bucket keys enabled, and each object is no larger than 5 GB, you should expect the number of chargeable KMS operations against the new KMS key to be around 1% (as stated in the document above) of the number of CopyObject calls you make, i.e., of the number of objects you copy. If the old objects also use SSE-KMS without bucket keys, there will additionally be a single kms:Decrypt call per object against the old key.
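A back-of-the-envelope version of that estimate in Python follows. The function name is hypothetical and the 1% ratio is only the approximate figure mentioned above, so treat the result as an order-of-magnitude guess and not as a billing prediction.

```python
def estimated_kms_calls(num_objects, old_data_uses_sse_kms=False,
                        bucket_key_ratio=0.01):
    """Estimate KMS calls for re-encrypting num_objects with bucket keys on."""
    # Roughly 1% of the CopyObject calls reach KMS for the new key.
    new_key_calls = round(num_objects * bucket_key_ratio)
    # Old SSE-KMS objects without bucket keys need one kms:Decrypt each.
    old_key_calls = num_objects if old_data_uses_sse_kms else 0
    return {"new_key": new_key_calls, "old_key": old_key_calls}
```

For 10,000 objects that were previously SSE-S3 encrypted, this comes to roughly 100 calls against the new key and none against an old key.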

The cheapest option would be not to use SSE-KMS but SSE-S3 instead. It has no encryption charges at all and does work across account boundaries. However, even though the data is encrypted at rest, you gain no control over the keys and thereby no security benefit from separating the access controls over the keys from controls over the data.

EXPERT
answered 2 years ago
EXPERT
reviewed 2 years ago
