GPG Not Available in AWS Glue 5

Hello AWS Support Team,

We are currently in the process of upgrading our AWS Glue Jobs from version 4 to version 5 to take advantage of the latest features and long-term support.

However, we’ve encountered a critical issue during this migration:

In Glue version 4, we used the gpg utility (installed via the system package manager) to decrypt .gpg files as part of our ETL pipeline. This worked without issues.

After migrating to Glue version 5, we discovered that the gpg binary is no longer available in the runtime environment. It also cannot be installed via yum or dnf.

From what we’ve found, Glue v5 is based on Amazon Linux 2023, and it seems that gpg is not currently available there (reference issue).

This is a blocker for our Glue v5 migration, as decrypting PGP-encrypted files is a required step in our workflow. Is there an AWS-recommended approach or supported workaround for using GPG functionality in Glue v5? Thanks

  • Actually, the issue is the missing gpg-agent: the gpg binary (v2.3.7) is present, but gpg-agent is not.
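
A quick way to confirm which GnuPG components a given runtime actually exposes is to probe for them from inside the job. The following is a minimal sketch using only the Python standard library; the helper names `find_gpg_tools` and `gpg_version` are illustrative, and the check only covers executables reachable through PATH.

```python
import shutil
import subprocess


def find_gpg_tools(names=("gpg", "gpg-agent", "gpgconf")):
    """Map each tool name to its resolved path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


def gpg_version(gpg_path):
    """Return the first line of `gpg --version`, or None if it cannot be run."""
    try:
        result = subprocess.run(
            [gpg_path, "--version"],
            capture_output=True,
            text=True,
            check=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    tools = find_gpg_tools()
    for name, path in tools.items():
        print(f"{name}: {path or 'NOT FOUND'}")
    if tools["gpg"]:
        print(gpg_version(tools["gpg"]))
```

Running this at the start of a job writes the result to the job logs, which makes it easy to tell "gpg missing" apart from "gpg present, gpg-agent missing".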

asked 5 months ago · 189 views
1 answer

When upgrading from AWS Glue 4.0 to 5.0, you're encountering a significant change in the underlying environment. AWS Glue 5.0 is based on Amazon Linux 2023 (and uses Java 17), and the change of base operating system explains the differences you're experiencing with the GPG utility.

For handling encrypted files in AWS Glue 5.0, you have several options:

  1. Use a pure-Python PGP library: AWS Glue 5.0 supports Python 3.11, so you could use a library such as PGPy, which implements OpenPGP in Python and does not call the gpg binary. Note that python-gnupg is only a wrapper around the gpg executable, so it does not remove the dependency on gpg and gpg-agent. AWS Glue 5.0 also supports installing additional Python libraries from a requirements.txt file, which would let you add this dependency.

  2. Consider AWS encryption services: If possible, you might want to redesign your workflow to use AWS native encryption services like AWS KMS (Key Management Service) instead of GPG. AWS Glue supports encrypting data at rest using AWS KMS keys.

  3. Pre-process your files: You could consider decrypting the GPG files before they reach your Glue job, perhaps using a Lambda function or another service in your data pipeline that has access to the GPG utility.

  4. Use AWS Glue's security configurations: For data that needs to be encrypted, AWS Glue provides security configurations that allow you to encrypt data written to Amazon S3 and CloudWatch Logs.
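
As an illustration of option 1, here is a minimal sketch of decrypting a file with PGPy, which does not need the gpg binary or gpg-agent. It assumes PGPy has been added to the job (for example through the `--additional-python-modules` job parameter) and that the private key and the encrypted file have already been downloaded to local paths. The function name `decrypt_pgp_file` and all paths are hypothetical, and PGPy may not support every key type or cipher that GnuPG does, so test it against your real files first.

```python
from pathlib import Path


def decrypt_pgp_file(encrypted_path, private_key_path, passphrase, output_path):
    """Decrypt a PGP-encrypted file with PGPy and write the plaintext to output_path."""
    # Third-party dependency, imported here so the module still loads
    # in environments where PGPy has not been installed yet.
    import pgpy

    # from_file returns a (key, other_keys) tuple; only the primary key is used.
    key, _ = pgpy.PGPKey.from_file(str(private_key_path))
    message = pgpy.PGPMessage.from_file(str(encrypted_path))

    if key.is_protected:
        # The private key is only usable inside the unlock() context.
        with key.unlock(passphrase):
            decrypted = key.decrypt(message)
    else:
        decrypted = key.decrypt(message)

    # .message is str for text literals and bytearray for binary ones.
    payload = decrypted.message
    if isinstance(payload, str):
        payload = payload.encode("utf-8")

    out = Path(output_path)
    out.write_bytes(bytes(payload))
    return out
```

A call might look like `decrypt_pgp_file("/tmp/data.csv.gpg", "/tmp/private.asc", passphrase, "/tmp/data.csv")`, with the passphrase and key ideally fetched from AWS Secrets Manager rather than stored in the script.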

If these alternatives don't meet your requirements and you absolutely need the GPG binary, you might need to either:

  • Stay on AWS Glue 4.0 until a solution becomes available
  • Reach out to AWS Support for specific guidance on your use case

When migrating to AWS Glue 5.0, it's important to review all the changes and limitations documented in the release notes to ensure compatibility with your existing workflows.
Sources
Encrypting data at rest - AWS Glue
Encrypting data written by AWS Glue - AWS Glue
AWS Glue versions - AWS Glue

answered 5 months ago
  • We are currently using python-gnupg along with the gpg binary to encrypt and decrypt .gpg files in our Glue jobs.

    Since python-gnupg depends on the gpg executable, and gpg is not available in AWS Glue 5.0 (and cannot be installed), this solution no longer works.

    Could you please confirm if there is any plan to support gpg in Glue 5.0 or provide an official workaround?
