Has --java-options been removed in Glue 5.0?


On this page, https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-glue-arguments.html, I do not see --java-options listed as a job parameter.

We use:  "--java-options" = "-Ddb2.jcc.charsetDecoderEncoder=3"

We use it to filter unusual characters that come through from a DB2 database. It worked fine through Glue 4, but in Glue 5 our jobs have failed with this error message:

Caused by: java.nio.charset.MalformedInputException: Input length = 1

Any suggestions for a workaround?

asked 2 months ago, 135 views
2 Answers

We tried the --conf option that is listed in the docs, but every combination we have tried has failed with the error:

LAUNCH ERROR | Glue bootstrap failed. Please refer logs for details. caused by LAUNCH ERROR | Invalid input to --confPlease refer logs for details.

We have tried:

key: --conf value: spark.driver.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3,spark.executor.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3

key: --conf value: spark.executor.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3

key: --conf value: "spark.driver.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3 --conf spark.executor.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3"

key: --conf value: spark.driver.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3 --conf spark.executor.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3

key: --conf value: "spark.driver.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3"

However, this option was accepted (although it does not solve our problem): key: --conf value: spark.executor.extraJavaOptions=-XX:+PrintGCTimeStamps
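For reference, the unquoted chained form among the attempts above can be generated rather than typed by hand, which at least rules out stray quotes or commas. This is only a formatting sketch: `build_conf_value` is a hypothetical helper name, and as reported above this form still failed at launch for the DB2 property on Glue 5.0.

```python
from typing import Mapping


def build_conf_value(spark_conf: Mapping[str, str]) -> str:
    """Join Spark settings into one value for the Glue `--conf` job parameter.

    The first setting is emitted bare and each further setting is prefixed
    with ' --conf ', mirroring the unquoted chained form tried above.
    """
    if not spark_conf:
        raise ValueError("at least one Spark setting is required")
    pairs = [f"{key}={value}" for key, value in spark_conf.items()]
    return " --conf ".join(pairs)


jvm_flag = "-Ddb2.jcc.charsetDecoderEncoder=3"
conf_value = build_conf_value({
    "spark.driver.extraJavaOptions": jvm_flag,
    "spark.executor.extraJavaOptions": jvm_flag,
})

# Job arguments as a mapping of parameter name to value.
job_arguments = {"--conf": conf_value}
```

Generating the value this way keeps the driver and executor flags identical, so a launch failure can be attributed to the flag itself rather than to a typo in one of the two settings.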

Since we are trying to set an option for the DB2 library, this makes me suspect that the Glue environment applies the --conf options before the --extra-jars are added to the classpath. Would that cause --conf to fail because Spark/Java cannot find the option being set?
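One way to narrow this down, for any configuration that does launch (for example the original --java-options setting on Glue 5), is to print the system property from inside the job script. The sketch below assumes `sc` is the job's SparkContext and goes through its py4j gateway (`sc._jvm`, a private attribute); the helper names are hypothetical. It inspects the driver JVM only, so it says nothing about executors.

```python
from typing import Any, Optional


def read_driver_property(jvm: Any, name: str) -> Optional[str]:
    """Return a Java system property from the driver JVM, or None if unset.

    `jvm` is expected to be a py4j JVM view such as `sc._jvm`.
    """
    return jvm.java.lang.System.getProperty(name)


def report_property(jvm: Any, name: str) -> str:
    """Format a one-line report suitable for the job log."""
    value = read_driver_property(jvm, name)
    if value is None:
        return f"{name} is NOT set on the driver JVM"
    return f"{name}={value} on the driver JVM"


# Inside a Glue job script, assuming `sc` is the job's SparkContext:
#     print(report_property(sc._jvm, "db2.jcc.charsetDecoderEncoder"))
```

If the property is reported as not set, the job parameter never reached the driver JVM. If it is set and the exception still occurs, the problem is more likely in the executors or in how the driver version reads the property.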

answered 2 months ago

Based on the information available, there is no explicit mention of the removal of "--java-options" in AWS Glue 5.0. However, it's important to note that AWS Glue 5.0 has undergone significant changes, including upgrades to Java 17 and Spark 3.5.2.

The error you're encountering ("java.nio.charset.MalformedInputException: Input length = 1") suggests an issue with character encoding, which is what your Java option was addressing in previous versions.
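As a rough analogy in Python (this is not the DB2 driver), the difference between strict decoding, which raises on a malformed byte much like the error above, and lenient decoding, which substitutes a replacement character, can be sketched as follows. That `-Ddb2.jcc.charsetDecoderEncoder=3` selects substitute-instead-of-throw behaviour in the driver is an assumption based on how the question describes the option; the sample bytes are made up.

```python
# Made-up sample: "cafe" with a Latin-1 e-acute (0xE9), which is not valid UTF-8 here.
raw = b"caf\xe9"


def decode_strict(data: bytes) -> str:
    """Decode as UTF-8 and raise on malformed input."""
    return data.decode("utf-8", errors="strict")


def decode_lenient(data: bytes) -> str:
    """Decode as UTF-8, replacing malformed bytes with U+FFFD instead of raising."""
    return data.decode("utf-8", errors="replace")


try:
    decode_strict(raw)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

lenient_text = decode_lenient(raw)
```

With strict decoding a single bad byte aborts the whole read, which matches a job that fails on one unusual character; with replacement the row survives with a marker character in place of the bad byte.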

While I can't confirm if "--java-options" has been removed, here are a few potential workarounds you could try:

  1. Use the "--conf" option instead. You might be able to set the same property using Spark configuration:

     key: --conf, value: spark.driver.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3

     Or else use:

     key: --executor-java-options, value: -Ddb2.jcc.charsetDecoderEncoder=3
     key: --driver-java-options, value: -Ddb2.jcc.charsetDecoderEncoder=3

  2. If you're using a JDBC connection to DB2, you might be able to set this property in the connection URL or in the connection options.

  3. Consider using the new "requirements.txt" support in Glue 5.0 to install additional Python libraries that could help with character encoding issues.

  4. Check if there are any new Glue 5.0 specific options or configurations that address character encoding for DB2 connections.

  5. As Glue 5.0 uses Java 17, investigate if there are any Java 17 specific ways to handle this character encoding issue.
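The connection-URL suggestion in the list above could be sketched as below. It relies on the DB2 JDBC URL form `jdbc:db2://host:port/database:property=value;`. The host, port and database are placeholders, `build_db2_jdbc_url` is a hypothetical helper, and whether the driver accepts `charsetDecoderEncoder` as a connection-level property (rather than only as a JVM system property) is an unverified assumption that should be checked against the IBM driver documentation.

```python
from typing import Mapping


def build_db2_jdbc_url(host: str, port: int, database: str,
                       properties: Mapping[str, str]) -> str:
    """Build a DB2 JDBC URL of the form jdbc:db2://host:port/db:key=value;key2=value2;"""
    base = f"jdbc:db2://{host}:{port}/{database}"
    if not properties:
        return base
    # Each property is terminated by a semicolon, including the last one.
    suffix = "".join(f"{key}={value};" for key, value in properties.items())
    return f"{base}:{suffix}"


# Placeholder connection details.
# ASSUMPTION: not confirmed that the driver honours this as a connection property.
url = build_db2_jdbc_url(
    "db2.example.com", 50000, "SAMPLE",
    {"charsetDecoderEncoder": "3"},
)
```

If the driver does honour the setting here, it would travel with the connection itself and would not depend on how Glue passes JVM options to the driver and executors.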

If these workarounds don't resolve the issue, you may need to contact AWS support for more specific guidance on handling DB2 character encoding in Glue 5.0. They might be able to provide more targeted solutions or confirm if there have been changes to how Java options are handled in this version.
Sources
AWS Glue versions - AWS Glue
Develop and test AWS Glue 5.0 jobs locally using a Docker container | AWS Big Data Blog

answered 2 months ago
AWS
SUPPORT ENGINEER
revised 2 months ago
