We tried the --conf option described in the docs, but every combination we tried failed with this error:
LAUNCH ERROR | Glue bootstrap failed. Please refer logs for details. caused by LAUNCH ERROR | Invalid input to --confPlease refer logs for details.
We have tried:
- key: --conf value: spark.driver.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3,spark.executor.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3
- key: --conf value: spark.executor.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3
- key: --conf value: "spark.driver.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3 --conf spark.executor.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3"
- key: --conf value: spark.driver.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3 --conf spark.executor.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3
- key: --conf value: "spark.driver.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3"
However, the following option was accepted (although it does not solve our problem): key: --conf value: spark.executor.extraJavaOptions=-XX:+PrintGCTimeStamps
Since we are trying to set an option on the DB2 library, this makes me suspect a load-order problem: is the Glue environment applying the --conf options before the --extra-jars are loaded onto the classpath? Would that cause --conf to fail because Spark/Java cannot find the option to set?
Based on the information available, there is no explicit mention of the removal of "--java-options" in AWS Glue 5.0. However, it's important to note that AWS Glue 5.0 has undergone significant changes, including upgrades to Java 17 and Spark 3.5.2.
The error you're encountering ("java.nio.charset.MalformedInputException: Input length = 1") suggests an issue with character encoding, which is what your Java option was addressing in previous versions.
While I can't confirm whether "--java-options" has been removed, here are a few potential workarounds you could try:
- Use the "--conf" option instead. You might be able to set the same property using Spark configuration: key: --conf, value: spark.driver.extraJavaOptions=-Ddb2.jcc.charsetDecoderEncoder=3
  Or else use key: --executor-java-options, value: -Ddb2.jcc.charsetDecoderEncoder=3 and key: --driver-java-options, value: -Ddb2.jcc.charsetDecoderEncoder=3
- If you're using a JDBC connection to DB2, you might be able to set this property in the connection URL or in the connection options.
- Consider using the new "requirements.txt" support in Glue 5.0 to install additional Python libraries that could help with character encoding issues.
- Check if there are any new Glue 5.0 specific options or configurations that address character encoding for DB2 connections.
- As Glue 5.0 uses Java 17, investigate if there are any Java 17 specific ways to handle this character encoding issue.
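For the "--conf" workaround, a job's parameters form a key/value mapping, so only one "--conf" key can exist and any further Spark properties have to be chained inside its value. The sketch below only builds that mapping in plain Python. The helper name `build_conf_argument` is hypothetical, and nothing here confirms that Glue 5.0 accepts the resulting value; elsewhere in this thread the same chained form was reported to fail with "Invalid input to --conf".

```python
def build_conf_argument(spark_props):
    """Join Spark properties into a single chained ``--conf`` job parameter.

    Hypothetical helper for illustration; it performs no AWS calls.
    """
    if not spark_props:
        raise ValueError("at least one Spark property is required")
    pairs = [f"{key}={value}" for key, value in spark_props.items()]
    # A mapping can hold only one "--conf" key, so every property after
    # the first is appended to the value as " --conf key=value".
    return {"--conf": " --conf ".join(pairs)}


# The JVM option discussed in this thread.
JAVA_OPT = "-Ddb2.jcc.charsetDecoderEncoder=3"

job_arguments = build_conf_argument({
    "spark.driver.extraJavaOptions": JAVA_OPT,
    "spark.executor.extraJavaOptions": JAVA_OPT,
})

# Print the value so it can be pasted into the job parameters as-is.
print(job_arguments["--conf"])
```

Generating the value this way at least rules out quoting or spacing mistakes when comparing which combinations Glue accepts. The resulting dictionary has the same shape as a job's arguments mapping, but whether the value is accepted still has to be verified against the Glue version in use.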
If these workarounds don't resolve the issue, you may need to contact AWS support for more specific guidance on handling DB2 character encoding in Glue 5.0. They might be able to provide more targeted solutions or confirm if there have been changes to how Java options are handled in this version.
Sources
AWS Glue versions - AWS Glue
Develop and test AWS Glue 5.0 jobs locally using a Docker container | AWS Big Data Blog