CloudWatch logs not showing


I have implemented a SageMaker pipeline that contains processing, training, and evaluation steps. I used to be able to see the logs for all of them, but now only the evaluation and training job logs appear in their CloudWatch log groups. The log groups for the processing and training jobs are still present, and I deleted some old logs in the processing group to make room for new ones, but that did not help either. This happened once before and somehow resolved itself, and now it has happened again. Could you please advise on this issue?

Here is the code for my processing step:


# Script processor using a custom Docker image from ECR
script_processor = ScriptProcessor(
    command=["python3"],
    image_uri=processing_and_splitting_image,
    role=role,
    instance_count=processing_instance_count,
    instance_type=processing_instance_type,
    base_job_name="data-process",
)

step_process = ProcessingStep(
    name="DataPreprocessingStep",
    code="s3://{}/sagemaker/scripts/processing.py".format(bucket_source),
    processor=script_processor,
    inputs=[
        ProcessingInput(
            input_name="./data/raw_df.csv",
            source="s3://{}/sagemaker/data/raw_df.csv".format(bucket),
            destination="/opt/ml/processing/input",
        )
    ],
    outputs=[
        ProcessingOutput(
            output_name="train",
            source="/opt/ml/processing/train",
            destination="s3://{}/train".format(bucket),
        ),
        ProcessingOutput(
            output_name="test",
            source="/opt/ml/processing/test",
            destination="s3://{}/test".format(bucket),
        ),
    ],
    job_arguments=["--train-test-split-ratio", "0.2"],
)
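For completeness, here is a minimal sketch (assuming the default region and credentials from the environment) of the quick check I run to see whether any new log streams are being created in the processing-jobs log group:

import boto3

logs = boto3.client("logs")

# SageMaker Processing jobs write to this fixed log group
response = logs.describe_log_streams(
    logGroupName="/aws/sagemaker/ProcessingJobs",
    orderBy="LastEventTime",
    descending=True,
    limit=20,
)

# Print the most recently active streams and their last event timestamps
for stream in response["logStreams"]:
    print(stream["logStreamName"], stream.get("lastEventTimestamp"))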

1 Answer

A common cause of this issue is that the execution role is missing the permissions needed to create CloudWatch log groups and log streams.

Here is a permissions snippet from a notebook's execution role whose logs are being published successfully:

{
    "Action": "logs:PutLogEvents",
    "Effect": "Allow",
    "Resource": "<<ARN_NO>>",
    "Sid": "Logs"
},
{
    "Action": [
        "logs:DescribeLogStreams",
        "logs:CreateLogStream",
        "logs:CreateLogGroup"
    ],
    "Effect": "Allow",
    "Resource": "<<ARN_NO>>",
    "Sid": "Logs2"
}

Please review the IAM statements above and adapt them to your requirements.
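As a rough sketch (the role name and policy name below are placeholders, and the Resource ARN is left as the same placeholder used above), the two statements can be attached to the execution role as an inline policy with boto3:

import json
import boto3

iam = boto3.client("iam")

# Placeholder: replace with the pipeline's actual execution role name
role_name = "MySageMakerExecutionRole"

policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Logs",
            "Effect": "Allow",
            "Action": "logs:PutLogEvents",
            "Resource": "<<ARN_NO>>",
        },
        {
            "Sid": "Logs2",
            "Effect": "Allow",
            "Action": [
                "logs:DescribeLogStreams",
                "logs:CreateLogStream",
                "logs:CreateLogGroup",
            ],
            "Resource": "<<ARN_NO>>",
        },
    ],
}

# Attach the statements as an inline policy on the execution role
iam.put_role_policy(
    RoleName=role_name,
    PolicyName="SageMakerCloudWatchLogs",
    PolicyDocument=json.dumps(policy_document),
)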

AWS
answered a year ago
  • Hello @Ram_AWS, thank you for your answer. I thought this would be the cause, but it turns out the script inside the processing step is hitting an issue while reading a CSV file: /opt/ml/processing/input/code/processing.py:153: DtypeWarning: Columns (14) have mixed types. Specify dtype option on import or set low_memory=False. df_raw = pd.read_csv(f"{base_dir}/input/raw_df.csv").

    It also turns out that whenever I try to fix this by adding low_memory=False, the CloudWatch logs are not published. I am still looking for a solution to this CSV reading problem, so I would be glad if you have something to add here.
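    For the CSV warning itself, here is a minimal sketch of the two usual options (base_dir is assumed to be "/opt/ml/processing", matching the ProcessingInput destination above, and "mixed_col" is a placeholder for whichever column the warning flags):

    import pandas as pd

    base_dir = "/opt/ml/processing"

    # Option 1: read the whole file in one pass before inferring dtypes
    # (uses more memory, which may matter on a small processing instance)
    df_raw = pd.read_csv(f"{base_dir}/input/raw_df.csv", low_memory=False)

    # Option 2: pin the offending column to a single dtype up front,
    # which avoids the extra memory pressure of low_memory=False
    df_raw = pd.read_csv(
        f"{base_dir}/input/raw_df.csv",
        dtype={"mixed_col": str},  # placeholder name; use the real column from the header
    )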
