Glue Ray: getting "Error while writing result to S3 working directory" even when the job runs successfully

I have a Glue job of type "Ray" that was deployed using CDK. I'm using the following parameters for the job run: --enable-glue-datacatalog true library-set analytics --TempDir s3://{bucket}/temporary/ --additional-python-modules s3://{bucket}/{module}.zip

The job has a role with access to the buckets for both --TempDir and --additional-python-modules. In the CloudWatch logs I can see that the job does everything it's supposed to do, but in the console the job fails with the error "Error while writing result to S3 working directory". I can't find any details in any of the log groups.

asked a year ago, 406 views
2 Answers

You are still missing some permissions needed to write the working directory files. The location is shown under

Glue job > Details Tab > Advanced > Temporary Path

These are the permissions you would need:

{
    "Action": [
        "s3:Abort*",
        "s3:DeleteObject*",
        "s3:GetBucket*",
        "s3:GetObject*",
        "s3:List*",
        "s3:PutObject",
        "s3:PutObjectLegalHold",
        "s3:PutObjectRetention",
        "s3:PutObjectTagging",
        "s3:PutObjectVersionTagging"
    ],
    "Resource": [
        "arn:aws:s3:::glue-assets-xxxxxxx",
        "arn:aws:s3:::glue-assets-xxxxxxx/*"
    ],
    "Effect": "Allow"
}
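
If it helps, here is a minimal Python sketch that builds the same statement for an arbitrary bucket, for example to print it and paste it into a role policy. The function name working_dir_statement and the bucket name in the example are placeholders made up for illustration; the action list is copied from the statement above, and the "Version" wrapper is the standard IAM policy envelope.

```python
import json


def working_dir_statement(bucket_name: str) -> dict:
    """Build the IAM statement shown above for the given bucket name."""
    return {
        "Effect": "Allow",
        "Action": [
            "s3:Abort*",
            "s3:DeleteObject*",
            "s3:GetBucket*",
            "s3:GetObject*",
            "s3:List*",
            "s3:PutObject",
            "s3:PutObjectLegalHold",
            "s3:PutObjectRetention",
            "s3:PutObjectTagging",
            "s3:PutObjectVersionTagging",
        ],
        # Bucket-level actions need the bucket ARN, object-level actions
        # need the "/*" ARN, so both are listed.
        "Resource": [
            f"arn:aws:s3:::{bucket_name}",
            f"arn:aws:s3:::{bucket_name}/*",
        ],
    }


if __name__ == "__main__":
    # "my-glue-script-bucket" is a placeholder, not a real bucket.
    policy = {
        "Version": "2012-10-17",
        "Statement": [working_dir_statement("my-glue-script-bucket")],
    }
    print(json.dumps(policy, indent=2))
```

Whether every action in the list is strictly required is not something this sketch verifies; it only reproduces the statement as given.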
AWS
answered a year ago

Hi, thanks! This helped me get at the issue.

The solution is that Glue Ray jobs need the Put access above on the location of the script. My job was failing because my stack setup only granted Get access to the bucket where the Glue scripts are stored.

If your script is stored at s3://{script_base_path}/my_script.py, Glue Ray seems to want to put some metadata objects at

s3://{script_base_path}/jobs/{job_name}/{job_run_id}/job-result/metadata
s3://{script_base_path}/jobs/{job_name}/{job_run_id}/job-result/result
s3://{script_base_path}/jobs/{job_name}/{job_run_id}/job-result/stderr
s3://{script_base_path}/jobs/{job_name}/{job_run_id}/job-result/stdout

at the end of every job run.
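
To make the layout concrete, here is a small Python sketch that derives those four locations from the script location, which can be handy when checking which prefix the job role must be able to write to. The function name, the bucket, the job name and the run ID in the example are all hypothetical, and the layout is only what is described above, not documented behavior as far as I know.

```python
from typing import List

# Object names listed above under .../job-result/
RESULT_OBJECTS = ("metadata", "result", "stderr", "stdout")


def expected_job_result_uris(script_uri: str, job_name: str, job_run_id: str) -> List[str]:
    """Return the S3 URIs a Glue Ray run appears to write next to the script."""
    if not script_uri.startswith("s3://"):
        raise ValueError("expected an s3:// URI")
    # Split "s3://bucket/prefix/my_script.py" into base path and file name.
    base_path, _, file_name = script_uri.rpartition("/")
    if base_path == "s3:/" or not file_name:
        raise ValueError("expected a URI that points at a script object")
    prefix = f"{base_path}/jobs/{job_name}/{job_run_id}/job-result"
    return [f"{prefix}/{name}" for name in RESULT_OBJECTS]


if __name__ == "__main__":
    # All three arguments are made-up example values.
    for uri in expected_job_result_uris(
        "s3://my-glue-script-bucket/scripts/my_script.py", "my_job", "jr_example"
    ):
        print(uri)
```

The role then needs write access at least to everything under {script_base_path}/jobs/ for these objects to be created.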

This does not happen for other types of Glue jobs.

This location does not seem to be configurable to be anything but the "script base path" like the answer indicated.

answered a year ago
