Glue Ray: getting "Error while writing result to S3 working directory" even when the job runs successfully

I have a Glue job of type "Ray" that was deployed using CDK. I'm using the following parameters for the job run: --enable-glue-datacatalog true library-set analytics --TempDir s3://{bucket}/temporary/ --additional-python-modules s3://{bucket}/{module}.zip

The job's role has access to the buckets used for both --TempDir and --additional-python-modules. In the CloudWatch logs I can see that the job does everything it's supposed to do, but in the console the job fails with the error "Error while writing result to S3 working directory". I can't find any further details in any of the log groups.

Asked 1 year ago, viewed 413 times
2 answers

The job's role is still missing some permissions to write the working directory files. The location is shown under

Glue job > Details Tab > Advanced > Temporary Path

These are the permissions you would need:

{
    "Action": [
        "s3:Abort*",
        "s3:DeleteObject*",
        "s3:GetBucket*",
        "s3:GetObject*",
        "s3:List*",
        "s3:PutObject",
        "s3:PutObjectLegalHold",
        "s3:PutObjectRetention",
        "s3:PutObjectTagging",
        "s3:PutObjectVersionTagging"
    ],
    "Resource": [
        "arn:aws:s3:::glue-assets-xxxxxxx",
        "arn:aws:s3:::glue-assets-xxxxxxx/*"
    ],
    "Effect": "Allow"
}
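
As a rough illustration, the statement above could be generated per bucket with a few lines of Python. This is only a sketch: the helper names `working_dir_statement` and `policy_document` are made up for this example, the bucket name is the placeholder from the statement, and the "Version" field is the standard IAM policy version string rather than something taken from this thread.

```python
import json

# Actions copied verbatim from the policy statement above.
WORKING_DIR_ACTIONS = [
    "s3:Abort*",
    "s3:DeleteObject*",
    "s3:GetBucket*",
    "s3:GetObject*",
    "s3:List*",
    "s3:PutObject",
    "s3:PutObjectLegalHold",
    "s3:PutObjectRetention",
    "s3:PutObjectTagging",
    "s3:PutObjectVersionTagging",
]


def working_dir_statement(bucket: str) -> dict:
    """Build an Allow statement for the bucket itself and every object in it."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Effect": "Allow",
        "Action": list(WORKING_DIR_ACTIONS),
        "Resource": [bucket_arn, f"{bucket_arn}/*"],
    }


def policy_document(buckets) -> dict:
    """Wrap one statement per bucket in an IAM policy document."""
    return {
        "Version": "2012-10-17",
        "Statement": [working_dir_statement(b) for b in buckets],
    }


if __name__ == "__main__":
    # "glue-assets-xxxxxxx" is the placeholder bucket name from the answer.
    print(json.dumps(policy_document(["glue-assets-xxxxxxx"]), indent=2))
```

The printed JSON can then be attached to the job role by whatever means the stack already uses; how that is wired up in CDK is left out here on purpose.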
Answered 1 year ago by AWS

Hi, thanks! This helped me get at the issue.

The solution is that every Glue Ray job needs the Put access listed above on the location of the script. My job was failing because my stack only granted Get access to the bucket where the Glue scripts are stored.

If your script is stored at s3://{script_base_path}/my_script.py, Glue Ray seems to want to put some metadata objects at

s3://{script_base_path}/jobs/{job_name}/{job_run_id}/job-result/metadata
s3://{script_base_path}/jobs/{job_name}/{job_run_id}/job-result/result
s3://{script_base_path}/jobs/{job_name}/{job_run_id}/job-result/stderr
s3://{script_base_path}/jobs/{job_name}/{job_run_id}/job-result/stdout

at the end of every job run.
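
To check which objects a given run would try to write, the paths above can be derived from the script location. The following is a minimal sketch based only on the layout observed in this answer, not on documented Glue behaviour; the function name `job_result_uris` and the example bucket, job name, and run ID are hypothetical.

```python
from urllib.parse import urlparse

# Object names observed under .../job-result/ in this answer.
RESULT_OBJECTS = ("metadata", "result", "stderr", "stdout")


def job_result_uris(script_location: str, job_name: str, job_run_id: str) -> list:
    """Return the S3 URIs a Glue Ray run appears to write next to the script."""
    parsed = urlparse(script_location)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {script_location!r}")

    # Drop the script file name to get the "script base path".
    base_key = parsed.path.lstrip("/").rpartition("/")[0]
    base = f"s3://{parsed.netloc}/{base_key}" if base_key else f"s3://{parsed.netloc}"

    prefix = f"{base}/jobs/{job_name}/{job_run_id}/job-result"
    return [f"{prefix}/{name}" for name in RESULT_OBJECTS]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for uri in job_result_uris("s3://my-bucket/scripts/my_script.py", "my-job", "jr_123"):
        print(uri)
```

Comparing these URIs against the resources in the role's policy is a quick way to see whether the script bucket is missing Put access.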

This does not happen for other types of Glue jobs.

The location does not seem to be configurable; it always appears to be the "script base path", as the answer above indicated.

Answered 1 year ago
