The simplest way to store the data keys is together with the encrypted data, with each data key itself encrypted under the KMS key. This way, the durability of your data and of your data keys naturally go hand in hand: a hardware failure or human error is unlikely to lose the keys while the data encrypted with them survives.
When you use KMS's GenerateDataKey API to generate your data keys, you'll receive in the response both the plaintext data key that is ready for your code to use for cryptographic operations and the same data key encrypted with the KMS key in whose context the data key was generated. The encrypted data key will be in the CiphertextBlob property of the API response. You can safely store the KMS-encrypted data key and the ARN or ID of the corresponding KMS key together with your data. This is also explained and recommended briefly in the API documentation for GenerateDataKey: https://docs.aws.amazon.com/kms/latest/APIReference/API_GenerateDataKey.html
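As a rough illustration of the storage pattern described above, here is a minimal Python sketch. It assumes a boto3-style KMS client (`generate_data_key` and `decrypt` with the `Plaintext`, `CiphertextBlob`, and `KeyId` response fields) and, for the default cipher, the third-party `cryptography` package. The function names, the record field names, and the choice of AES-GCM with a 12-byte nonce are my own assumptions for illustration, not something the KMS documentation prescribes.

```python
import base64
import os


def aes_gcm_encrypt(key, plaintext):
    """Encrypt with AES-256-GCM; returns nonce + ciphertext as one blob."""
    # Imported lazily; requires the third-party `cryptography` package.
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM
    nonce = os.urandom(12)
    return nonce + AESGCM(key).encrypt(nonce, plaintext, None)


def aes_gcm_decrypt(key, blob):
    """Inverse of aes_gcm_encrypt: split off the 12-byte nonce, then decrypt."""
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM
    return AESGCM(key).decrypt(blob[:12], blob[12:], None)


def encrypt_record(kms, kms_key_id, plaintext, encrypt=aes_gcm_encrypt):
    """Encrypt `plaintext` under a fresh data key and return a storable record.

    The record holds only the KMS-encrypted data key, never the plaintext key.
    """
    resp = kms.generate_data_key(KeyId=kms_key_id, KeySpec="AES_256")
    ciphertext = encrypt(resp["Plaintext"], plaintext)
    return {
        # ARN of the KMS key that wrapped the data key (hypothetical field name).
        "kms_key_arn": resp["KeyId"],
        "encrypted_data_key": base64.b64encode(resp["CiphertextBlob"]).decode("ascii"),
        "ciphertext": base64.b64encode(ciphertext).decode("ascii"),
    }


def decrypt_record(kms, record, decrypt=aes_gcm_decrypt):
    """Ask KMS to unwrap the stored data key, then decrypt the data locally."""
    resp = kms.decrypt(
        CiphertextBlob=base64.b64decode(record["encrypted_data_key"]),
        KeyId=record["kms_key_arn"],
    )
    return decrypt(resp["Plaintext"], base64.b64decode(record["ciphertext"]))
```

With a real client this would be called along the lines of `encrypt_record(boto3.client("kms"), "alias/my-app-key", b"some value")`, where the alias is a made-up placeholder. The returned dictionary can be stored next to, or as, the encrypted data itself.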
The point of rotating KMS keys is to limit the amount of data that gets encrypted with the same KMS key. One of the primary avenues to defeating a symmetric cipher is to compare the ciphertext generated with a given key against the known or presumed plaintext, and to exploit some discovered weakness in the cipher to guess parts of the encryption key, typically one bit at a time, from correlations or other patterns between the two. The less ciphertext an adversary has under one key, the less material there is for that kind of analysis. With annual rotation, for example, each KMS key version only encrypts the data keys generated during one year; data keys generated in subsequent years are encrypted under a different KMS key version, which is cryptographically entirely independent of the previous year's.
For this reason, you should generally not decrypt the data keys generated in the past and re-encrypt them with new KMS keys. If you did this systematically for a data set that grows over time, the total amount of data encrypted under the current key would keep increasing, because it would be proportional to the number of data keys in the entire data set rather than just the data keys for one year. That growth would only stop if and when you started deleting old data to put a time-based cap on the size of the data set.
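To put rough numbers on that argument, here is a small Python sketch. The figures (five years, 1,000 new data keys per year, annual rotation, nothing ever deleted) and the function name are purely hypothetical; the point is only the shape of the growth.

```python
def exposure_per_version(years, data_keys_per_year, rewrap_old_keys):
    """Number of data keys encrypted under each yearly KMS key version.

    Assumes annual rotation, a constant rate of new data keys, and no deletions.
    """
    exposure = []
    total_keys = 0
    for _ in range(years):
        total_keys += data_keys_per_year
        if rewrap_old_keys:
            # Every data key ever generated is re-encrypted under this version.
            exposure.append(total_keys)
        else:
            # Only this year's new data keys are encrypted under this version.
            exposure.append(data_keys_per_year)
    return exposure


print(exposure_per_version(5, 1000, rewrap_old_keys=False))
# [1000, 1000, 1000, 1000, 1000]
print(exposure_per_version(5, 1000, rewrap_old_keys=True))
# [1000, 2000, 3000, 4000, 5000]
```

Without re-encryption each key version protects a constant number of data keys; with systematic re-encryption the newest version ends up protecting all of them.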
However, suppose you have a valid reason to suspect that your data keys might have been compromised, or that an adversary could be continuously capturing encrypted keys that they could somehow brute-force offline, given enough time. In that case, decrypting the data, generating new data keys under a new KMS key, and re-encrypting the data with the new keys could conceivably keep buying you time faster than they could brute-force your new keys. I would still think of this more as a thought exercise than a practically viable use of encryption, particularly given that the "plaintext" in this case would be completely random AES encryption keys rather than, for example, the partly predictable content of typical HTTP traffic. If the situation prompting you to rotate data keys looks anything like that, it deserves more thought than I'm giving it now.
Hey! I'm still confused, so I think it's best to share what I'm trying to achieve so it is easy for you to help me out!!
I need to do column-level encryption in my database. I will choose all PII fields and encrypt them. Initially, I thought I would create a data key for each (PII) column I want to encrypt and then store the keys somewhere else, like in another database. When retrieving data, I would get the encrypted data key (from the other db), decrypt it using the master KMS key, and then decrypt the data.
However, after your insight on key rotation ("The point of rotating KMS keys is to limit the amount of data that gets encrypted with the same KMS key"), I think we need to store the encrypted data key with the data itself, because that is the only way to achieve that. But does this mean I should generate a new data key whenever I insert or update a record (not a column)?