Need more guidance on how to check the data pipeline objects


Every day a new EMR cluster spins up and terminates after completing its step job. Checking CloudTrail, it seems a Data Pipeline created it. I am not sure how to get more details, such as who created it, what the script is, and what its schedule looks like. Importantly, this involves additional unknown costs. I would highly appreciate any assistance with this.

Vaas
Asked 5 months ago · Viewed 233 times
2 Answers

Hi,

Using CloudTrail, you can get information about who generated the request. CloudTrail records events and actions related to the creation, modification, or execution of AWS Data Pipeline itself. The logs contain information about the user or role who initiated actions within AWS services such as AWS Data Pipeline and Amazon EMR: you can get the user name or role name associated with the user or service that initiated the action, which helps identify who performed a specific operation in Data Pipeline or EMR. CloudTrail logs also include timestamps indicating when each action occurred, allowing you to track the exact time and date of the event.

To view the actual scripts and schedules, you can review the pipeline definition or configuration. If the script is stored in a version control system, you can also check the repository directly. For further information you can refer to Logging and Monitoring in AWS Data Pipeline and Logging Amazon EMR API calls in AWS CloudTrail.
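As a rough sketch (assuming the AWS CLI is configured with credentials for your account and the region where the cluster runs), you can ask CloudTrail event history for recent RunJobFlow events, which EMR records when a cluster is launched, and see which principal made each call:

# List recent EMR cluster-launch events with the initiating user and timestamp
aws cloudtrail lookup-events --lookup-attributes AttributeKey=EventName,AttributeValue=RunJobFlow --max-results 10 --query 'Events[].{Time:EventTime,User:Username,Source:EventSource}'

If the cluster was launched by Data Pipeline, the user will typically show up as a role or service principal rather than a person, which is itself a useful clue.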

I hope it helps.

BezuW
AWS Support Engineer
Answered 5 months ago
Reviewed 19 days ago
  • Thank you!

Accepted Answer

Hello,

As @BezuW mentioned, you can refer to the ActivatePipeline event in CloudTrail to check who triggered the pipeline that starts processing pipeline tasks. I presume it gives the user ID; you can then run the "aws iam list-users" command to find the IAM user name or role and relate it back to the actual IAM user.
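As a rough sketch (assuming the CLI is configured for the same account and region, and the activation happened within the 90 days that CloudTrail event history retains), the lookup could look like this:

# Find who activated the pipeline and when
aws cloudtrail lookup-events --lookup-attributes AttributeKey=EventName,AttributeValue=ActivatePipeline --query 'Events[].{Time:EventTime,User:Username}'

# Cross-check the returned user name or ID against the IAM users in the account
aws iam list-users --query 'Users[].{Name:UserName,UserId:UserId}'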

Please note that the AWS Data Pipeline service is in maintenance mode and AWS is not planning to expand the service to new regions. Console access to AWS Data Pipeline is being removed, but you will continue to have access through the command line interface and API. You are recommended to migrate the workload to AWS Glue, Amazon MWAA, or AWS Step Functions if one of those can support it. More details here.

In the meantime, you can use the AWS CLI to check further.

To check a particular pipeline:

aws datapipeline describe-pipelines --pipeline-ids df-0examplepipeline
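If you do not know the pipeline ID yet, you can list every pipeline in the account first (no assumptions here beyond configured CLI credentials):

# List all Data Pipeline IDs and names in the current account/region
aws datapipeline list-pipelines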

To check the individual objects in a given pipeline:

aws datapipeline describe-objects --pipeline-id <value> --object-ids <value>
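To see the actual definition, including the schedule and the EMR activity or script it runs, you can also dump the pipeline definition (same placeholder pipeline ID as above):

# Dump the full pipeline definition (objects, schedules, EMR steps/scripts) as JSON
aws datapipeline get-pipeline-definition --pipeline-id <value>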

More datapipeline CLI commands here.

AWS Support Engineer
Answered 5 months ago
  • Thank you!
