GPG Not Available in AWS Glue 5

Hello AWS Support Team,

We are currently in the process of upgrading our AWS Glue Jobs from version 4 to version 5 to take advantage of the latest features and long-term support.

However, we’ve encountered a critical issue during this migration:

In Glue version 4, we used the gpg utility (installed via the system package manager) to decrypt .gpg files as part of our ETL pipeline. This worked without issues.

After migrating to Glue version 5, we discovered that the gpg binary is no longer available in the runtime environment. Additionally, it is not possible to install it via yum or dnf.

From what we’ve found, Glue v5 is based on Amazon Linux 2023, and it seems that gpg is not currently available there (reference issue).

This is a blocker for our Glue v5 migration, as decrypting PGP-encrypted files is a required step in our workflow. Is there an AWS-recommended approach or supported workaround for using GPG functionality in Glue v5? Thanks

  • Update: the actual problem is the missing gpg-agent. The gpg binary (v2.3.7) is present in the Glue 5 runtime, but gpg-agent is not.
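
    For anyone trying to confirm the same thing in their own job, a minimal sketch that reports which of the two binaries are on the PATH. It only uses the Python standard library; the function name find_gpg_tools is made up for illustration, and the result will depend on the runtime it is executed in.

    ```python
    import shutil


    def find_gpg_tools(names=("gpg", "gpg-agent")):
        """Map each tool name to its resolved path on PATH, or None if absent."""
        return {name: shutil.which(name) for name in names}


    if __name__ == "__main__":
        # Print one line per tool so the result shows up in the job logs.
        for name, path in find_gpg_tools().items():
            print(f"{name}: {path or 'NOT FOUND'}")
    ```

    Running this at the start of a Glue job should make it clear whether gpg, gpg-agent, or both are missing before any decryption is attempted.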

Asked 4 months ago, viewed 186 times
1 Answer

When upgrading from AWS Glue 4.0 to 5.0, you're encountering a significant change in the underlying environment. AWS Glue 5.0 is based on Amazon Linux 2023 and uses Java 17, which explains the differences you're experiencing with the GPG utility.

For handling encrypted files in AWS Glue 5.0, you have several options:

  1. Use Python-based GPG libraries: Since AWS Glue 5.0 supports Python 3.11, you could use Python libraries like python-gnupg or PGPy to handle GPG encryption/decryption tasks. AWS Glue 5.0 also supports using a requirements.txt file to install additional Python libraries, which would allow you to add these dependencies.

  2. Consider AWS encryption services: If possible, you might want to redesign your workflow to use AWS native encryption services like AWS KMS (Key Management Service) instead of GPG. AWS Glue supports encrypting data at rest using AWS KMS keys.

  3. Pre-process your files: You could consider decrypting the GPG files before they reach your Glue job, perhaps using a Lambda function or another service in your data pipeline that has access to the GPG utility.

  4. Use AWS Glue's security configurations: For data that needs to be encrypted, AWS Glue provides security configurations that allow you to encrypt data written to Amazon S3 and CloudWatch Logs.
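
As a rough illustration of option 1, here is a minimal sketch of decrypting a PGP message with PGPy. PGPy is used rather than python-gnupg because python-gnupg is a wrapper around the gpg executable, while PGPy does not call an external binary. The helper names (output_name_for, decrypt_pgp_bytes) are hypothetical, and how the private key and passphrase reach the job (for example from a secrets store) is an assumption left outside the sketch. Whether PGPy can read files produced by a particular gpg configuration should be verified against real data.

```python
from typing import Optional


def output_name_for(encrypted_name: str) -> str:
    """Derive a name for the decrypted file by dropping a PGP suffix."""
    lower = encrypted_name.lower()
    for suffix in (".gpg", ".pgp", ".asc"):
        if lower.endswith(suffix):
            return encrypted_name[: -len(suffix)]
    # No recognised suffix: append a marker instead of guessing.
    return encrypted_name + ".decrypted"


def decrypt_pgp_bytes(
    ciphertext: bytes,
    private_key_blob: str,
    passphrase: Optional[str] = None,
) -> bytes:
    """Decrypt a PGP message with an ASCII-armored private key using PGPy."""
    # Imported lazily so this module still loads where PGPy is not installed.
    import pgpy

    # PGPKey.from_blob returns a (key, other_keys) tuple.
    key, _ = pgpy.PGPKey.from_blob(private_key_blob)
    message = pgpy.PGPMessage.from_blob(ciphertext)

    if key.is_protected:
        if passphrase is None:
            raise ValueError("private key is passphrase-protected")
        # The key is only unlocked inside this context manager.
        with key.unlock(passphrase):
            decrypted = key.decrypt(message)
    else:
        decrypted = key.decrypt(message)

    # .message is str for text literals and a bytes-like object for binary ones.
    payload = decrypted.message
    return payload.encode("utf-8") if isinstance(payload, str) else bytes(payload)
```

In a job, the ciphertext bytes would typically be read from the source object, passed through decrypt_pgp_bytes, and written back under output_name_for(source_key). PGPy itself would need to be added to the job as an additional Python library.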

If these alternatives don't meet your requirements and you absolutely need the GPG binary, you might need to either:

  • Stay on AWS Glue 4.0 until a solution becomes available
  • Reach out to AWS Support for specific guidance on your use case

When migrating to AWS Glue 5.0, it's important to review all the changes and limitations documented in the release notes to ensure compatibility with your existing workflows.
Sources
Encrypting data at rest - AWS Glue
Encrypting data written by AWS Glue - AWS Glue
AWS Glue versions - AWS Glue

Answered 4 months ago
  • We are currently using python-gnupg together with the gpg binary to encrypt and decrypt .gpg files in our Glue jobs.

    Since python-gnupg depends on the gpg executable, and gpg is not available in AWS Glue 5.0 (and cannot be installed), this solution no longer works.

    Could you please confirm if there is any plan to support gpg in Glue 5.0 or provide an official workaround?
