Hello,
I have a .sql.gz file (~50 GB) in S3. I'm attempting to download it, decompress it, and upload the decompressed contents back to S3 (as .sql).
The Glue job successfully decompresses and uploads smaller files (the largest I've tested is ~1 GB).
However, whenever I attempt to process the larger ~50 GB file, I get the following error:
"Command failed with exit code 10"
Job run ID: jr_6517bedbe85935d03bca9b8797df4d357885398c86a3b907dee4c7f8dab42b6f
Some info about the job:
--Number of workers: 75 (G.1X)
--Glue version: 2.0 (Spark 2.4, Python 3)
Source code:
import gzip
from io import BytesIO

import boto3

s3_bucket_name = 'some-bucket'
stage_s3_key_prefix = 'prefix/to/stage'
source_s3_key = f'{stage_s3_key_prefix}/somefile.sql.gz'
target_s3_key = f'{stage_s3_key_prefix}/somefile.sql'

s3_client = boto3.client('s3')

# Download the gzipped object fully into memory, wrap the bytes in
# GzipFile so that reads return decompressed data, then upload the
# decompressed stream back to S3.
s3_client.upload_fileobj(
    Fileobj=gzip.GzipFile(
        None,
        'rb',
        fileobj=BytesIO(
            s3_client.get_object(
                Bucket=s3_bucket_name,
                Key=source_s3_key
            )['Body'].read()
        )
    ),
    Bucket=s3_bucket_name,
    Key=target_s3_key
)
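For what it's worth, the code above buffers the entire compressed object in memory (the BytesIO holds the full get_object response before GzipFile ever sees it), which is why I suspect the ~50 GB file behaves differently from the ~1 GB one. Below is an untested streaming sketch of the same idea, reusing the placeholder bucket/key names from above; it wraps the response's StreamingBody in GzipFile directly, so decompression happens on the fly as upload_fileobj pulls chunks for the multipart upload:

import gzip

import boto3

s3_bucket_name = 'some-bucket'
source_s3_key = 'prefix/to/stage/somefile.sql.gz'
target_s3_key = 'prefix/to/stage/somefile.sql'

s3_client = boto3.client('s3')

# The response Body is a StreamingBody; gzip.GzipFile only needs an
# object with read(), so it can decompress the stream incrementally
# instead of requiring the whole file in memory at once.
response = s3_client.get_object(Bucket=s3_bucket_name, Key=source_s3_key)

s3_client.upload_fileobj(
    Fileobj=gzip.GzipFile(fileobj=response['Body'], mode='rb'),
    Bucket=s3_bucket_name,
    Key=target_s3_key
)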
Complete error message:
timestamp | message |
--- | --- |
1603652066955 | awsglue-todworkers-iad-prod-2d-37f92aea.us-east-1.amazon.com Mon Sep 28 18:09:19 UTC 2020 gluetod |
1603652066957 | Preparing ... |
1603652067097 | Sun Oct 25 18:54:26 UTC 2020 |
1603652067098 | /usr/bin/java -cp /opt/amazon/conf:/opt/amazon/lib/hadoop-lzo/:/opt/amazon/lib/emrfs-lib/:/opt/amazon/spark/jars/:/opt/amazon/superjar/:/opt/amazon/lib/:/opt/amazon/Scala2.11/ com.amazonaws.services.glue.PrepareLaunch --conf spark.dynamicAllocation.enabled=true --conf spark.shuffle.service.enabled=true --conf spark.dynamicAllocation.minExecutors=1 --conf spark.dynamicAllocation.maxExecutors=74 --conf spark.executor.memory=10g --conf spark.executor.cores=8 --conf spark.driver.memory=10g --conf spark.default.parallelism=600 --conf spark.sql.shuffle.partitions=600 --conf spark.network.timeout=600 --JOB_ID j_4a14dc4e1fdbd099a4fb00ce7bfa27d1cfea60c075858eee61aa091355122e90 --JOB_RUN_ID jr_6517bedbe85935d03bca9b8797df4d357885398c86a3b907dee4c7f8dab42b6f --job-bookmark-option job-bookmark-disable --scriptLocation s3://roivant-data/scripts/unzip_gzip_on_s3 --job-language python --TempDir s3://aws-glue-temporary-692327028194-us-east-1/admin --JOB_NAME unzip_gzip_on_s3 |
1603652088320 | 1603652088317 |
1603652089451 | SLF4J: Class path contains multiple SLF4J bindings. |
1603652089451 | SLF4J: Found binding in [jar:file:/opt/amazon/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class] |
1603652089451 | SLF4J: Found binding in [jar:file:/opt/amazon/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] |
1603652089451 | SLF4J: Found binding in [jar:file:/opt/amazon/lib/log4j-slf4j-impl-2.8.jar!/org/slf4j/impl/StaticLoggerBinder.class] |
1603652089451 | SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. |
1603652089455 | SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] |
1603652093703 | WARN 2020-10-25 18:54:53,703 0 com.amazonaws.http.apache.utils.ApacheUtils [main] NoSuchMethodException was thrown when disabling normalizeUri. This indicates you are using an old version (< 4.5.8) of Apache http client. It is recommended to use http client version >= 4.5.9 to avoid the breaking change introduced in apache client 4.5.7 and the latency in exception handling. See https://github.com/aws/aws-sdk-java/issues/1919 for more information |
1603652094245 | Launching ... |
1603652094246 | Sun Oct 25 18:54:54 UTC 2020 |
1603652094411 | /usr/bin/java -cp /tmp:/opt/amazon/conf:/opt/amazon/lib/hadoop-lzo/:/opt/amazon/lib/emrfs-lib/:/opt/amazon/lib/emr-goodies/:/opt/amazon/lib/hive-jars/:/opt/amazon/spark/jars/:/opt/amazon/superjar/:/opt/amazon/lib/:/opt/amazon/Scala2.11/:/tmp/* -Dlog4j.configuration=log4j -server -Xmx10g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p' -XX:+UseCompressedOops -Djavax.net.ssl.trustStore=/opt/amazon/certs/ExternalAndAWSTrustStore.jks -Djavax.net.ssl.trustStoreType=JKS -Djavax.net.ssl.trustStorePassword=amazon -DRDS_ROOT_CERT_PATH=/opt/amazon/certs/rds-combined-ca-bundle.pem -DREDSHIFT_ROOT_CERT_PATH=/opt/amazon/certs/redshift-ssl-ca-cert.pem -DRDS_TRUSTSTORE_URL=file:/opt/amazon/certs/RDSTrustStore.jks -Dspark.network.timeout=600 -Dspark.dynamicAllocation.enabled=false -Dspark.dynamicAllocation.minExecutors=1 -Dspark.shuffle.service.enabled=false -Dspark.hadoop.mapred.output.committer.class=org.apache.hadoop.mapred.DirectOutputCommitter -Dspark.driver.extraClassPath=/tmp:/opt/amazon/conf:/opt/amazon/lib/hadoop-lzo/:/opt/amazon/lib/emrfs-lib/:/opt/amazon/lib/emr-goodies/:/opt/amazon/lib/hive-jars/:/opt/amazon/spark/jars/:/opt/amazon/superjar/:/opt/amazon/lib/:/opt/amazon/Scala2.11/ -Dspark.glue.JOB_NAME=unzip_gzip_on_s3 -Dspark.dynamicAllocation.maxExecutors=74 -Dspark.default.parallelism=600 -Dspark.hadoop.lakeformation.credentials.url=http://localhost:9998/lakeformationcredentials -Dspark.sql.shuffle.partitions=600 -Dspark.app.name=nativespark-unzip_gzip_on_s3-jr_6517bedbe85935d03bca9b8797df4d357885398c86a3b907dee4c7f8dab42b6f -Dspark.glue.GLUE_TASK_GROUP_ID=54429570-df1a-4dd3-9db8-61e6984666f5 -Dspark.hadoop.mapred.output.direct.EmrFileSystem=true -Dspark.glue.USE_PROXY=false -Dspark.eventLog.dir=/tmp/spark-event-logs/ -Dspark.rpc.askTimeout=600 -Dspark.executor.instances=74 -Dspark.executor.cores=8 -Dspark.driver.host=172.36.138.239 -Dspark.hadoop.fs.s3.impl=com.amazon.ws.emr.hadoop.fs.EmrFileSystem -Dspark.authenticate.secret=<HIDDEN> -Dspark.glue.JOB_RUN_ID=jr_6517bedbe85935d03bca9b8797df4d357885398c86a3b907dee4c7f8dab42b6f -Dspark.executor.memory=10g -Dspark.hadoop.mapred.output.direct.NativeS3FileSystem=true -Dspark.driver.memory=10g -Dspark.pyspark.python=/usr/bin/python3 -Dspark.glue.GLUE_COMMAND_CRITERIA=glueetl -Dspark.master=jes -Dspark.hadoop.mapreduce.fileoutputcommitter.marksuccessfuljobs=false -Dspark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2 -Dspark.unsafe.sorter.spill.read.ahead.enabled=false -Dspark.hadoop.parquet.enable.summary-metadata=false -Dspark.hadoop.glue.michiganCredentialsProviderProxy=com.amazonaws.services.glue.remote.LakeformationCredentialsProvider -Dspark.executor.extraClassPath=/tmp:/opt/amazon/conf:/opt/amazon/lib/hadoop-lzo/:/opt/amazon/lib/emrfs-lib/:/opt/amazon/lib/emr-goodies/:/opt/amazon/lib/hive-jars/:/opt/amazon/spark/jars/:/opt/amazon/superjar/:/opt/amazon/lib/*:/opt/amazon/Scala2.11/* -Dspark.glue.GLUE_VERSION=2.0 -Dspark.glue.endpoint=https://glue-jes-prod.us-east-1.amazonaws.com -Dspark.ui.enabled=false -Dspark.files.overwrite=true -Dspark.authenticate=true com.amazonaws.services.glue.ProcessLauncher --launch-class org.apache.spark.deploy.PythonRunner /opt/amazon/bin/runscript.py /tmp/unzip_gzip_on_s3 --JOB_ID j_4a14dc4e1fdbd099a4fb00ce7bfa27d1cfea60c075858eee61aa091355122e90 --JOB_RUN_ID jr_6517bedbe85935d03bca9b8797df4d357885398c86a3b907dee4c7f8dab42b6f --job-bookmark-option job-bookmark-disable --TempDir s3://aws-glue-temporary-692327028194-us-east-1/admin --JOB_NAME unzip_gzip_on_s3 |
1603652095339 | SLF4J: Class path contains multiple SLF4J bindings. |
1603652095339 | SLF4J: Found binding in [jar:file:/opt/amazon/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class] |
1603652095339 | SLF4J: Found binding in [jar:file:/opt/amazon/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] |
1603652095339 | SLF4J: Found binding in [jar:file:/opt/amazon/lib/log4j-slf4j-impl-2.8.jar!/org/slf4j/impl/StaticLoggerBinder.class] |
1603652095339 | SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. |
1603652095342 | SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] |
1603652421392 | 2020-10-25 19:00:21,384 INFO [main] glue.ProcessLauncher (Logging.scala:logInfo(54)): postprocessing |
1603652421419 | 2020-10-25 19:00:21,410 INFO [pool-1-thread-1] util.ShutdownHookManager (Logging.scala:logInfo(54)): Shutdown hook called |