Need more guidance on how to check the data pipeline objects


Every day a new EMR cluster is spun up and then terminated after completing its step job. According to CloudTrail, a Data Pipeline appears to have created it. I am not sure how to get more details, such as who created the pipeline, what script it runs, and what its schedule is. More importantly, this incurs additional cost that we cannot account for. I would highly appreciate any assistance with this.

Vaas
asked 5 months ago · 233 views
2 Answers

Hi,

Using CloudTrail, you can find out who generated the request. CloudTrail records events and actions related to the creation, modification, and execution of an AWS Data Pipeline, as well as the Amazon EMR API calls made on its behalf. Each log entry contains the user name or role name of the user or service that initiated the action, which identifies who performed the specific operation in Data Pipeline or EMR, and a timestamp showing exactly when the action occurred. To view the actual scripts and schedules, review the pipeline definition or configuration. If the script is stored in a version control system, you can also check the repository directly. For further information, refer to Logging and Monitoring in AWS Data Pipeline and Logging Amazon EMR API calls in AWS CloudTrail.
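As a rough sketch of that CloudTrail lookup from the command line: the function name datapipeline_events is made up for illustration, and the event source value datapipeline.amazonaws.com is an assumption worth confirming against an event in your own trail. Note that lookup-events only searches the recent event history (about the last 90 days of management events) in the current region.

```shell
# Hypothetical helper: list recent Data Pipeline API calls recorded by CloudTrail.
# Prints one line per event: time, API call name, and the calling user name.
# Assumption: Data Pipeline events use the event source "datapipeline.amazonaws.com".
datapipeline_events() {
  aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventSource,AttributeValue=datapipeline.amazonaws.com \
    --max-items "${1:-50}" \
    --query 'Events[].[EventTime,EventName,Username]' \
    --output text
}
```

Calling datapipeline_events 20 would show up to 20 recent events. Entries such as CreatePipeline, PutPipelineDefinition, or ActivatePipeline are the ones most likely to point to whoever set the pipeline up.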

I hope it helps.

BezuW (AWS)
answered 5 months ago
Reviewed by an AWS Support Engineer 19 days ago
  • Thank you!

Accepted Answer

Hello,

As @BezuW mentioned, you can look up the ActivatePipeline API call in CloudTrail to check who triggered the pipeline; this is the call that starts processing pipeline tasks. I presume it gives the user ID. You can run the "aws iam list-users" command (or "aws iam list-roles" if the caller was a role) to map that ID to the actual IAM user or role name.
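A minimal sketch of that ID-to-name mapping, assuming the ID taken from the CloudTrail event is an IAM user ID (typically starting with AIDA) or a role ID (typically starting with AROA). The function names are hypothetical. For an assumed role, CloudTrail usually reports the principal as the role ID followed by a colon and the session name, so only the part before the colon should be passed in.

```shell
# Hypothetical helper: print the IAM user name that owns the given user ID.
iam_user_name_for_id() {
  aws iam list-users \
    --query "Users[?UserId=='$1'].UserName" \
    --output text
}

# Hypothetical helper: print the IAM role name that owns the given role ID.
iam_role_name_for_id() {
  aws iam list-roles \
    --query "Roles[?RoleId=='$1'].RoleName" \
    --output text
}
```

Both calls need IAM read permissions in the account that owns the pipeline. An empty result most likely means the ID belongs to the other principal type or to a user or role that has since been deleted.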

On this date, you will no longer be able to access AWS Data Pipeline through the console. You will continue to have access to AWS Data Pipeline through the command line interface and API. Please note that the AWS Data Pipeline service is in maintenance mode and we are not planning to expand the service to new regions. We recommend migrating if the workload can be handled by AWS Glue, Amazon MWAA, or AWS Step Functions. More details here.

To check further, you can use only the AWS CLI.

To check a particular pipeline:

aws datapipeline describe-pipelines --pipeline-ids df-0examplepipeline

To check an individual object in the given pipeline:

aws datapipeline describe-objects --pipeline-id <value> --object-ids <value>
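If the pipeline ID or the object IDs are not known yet, they can be discovered first. The following is a sketch only; the function names are invented for illustration, while the underlying subcommands (list-pipelines, get-pipeline-definition, query-objects) are part of the aws datapipeline CLI. The pipeline definition is usually the quickest way to see the schedule, the EMR cluster settings, and the step or script that the activity runs.

```shell
# Hypothetical helper: list every pipeline ID and name in the current region.
list_pipelines() {
  aws datapipeline list-pipelines \
    --query 'pipelineIdList[].[id,name]' \
    --output text
}

# Hypothetical helper: print the full definition of one pipeline
# (schedule, EMR cluster, activities, and their parameters).
pipeline_definition() {
  aws datapipeline get-pipeline-definition --pipeline-id "$1"
}

# Hypothetical helper: list the object IDs in a pipeline.
# The sphere is COMPONENT (default here), INSTANCE, or ATTEMPT.
pipeline_object_ids() {
  aws datapipeline query-objects --pipeline-id "$1" --sphere "${2:-COMPONENT}" \
    --query 'ids' \
    --output text
}
```

For example, pipeline_object_ids df-0examplepipeline would print IDs that can then be passed to describe-objects through --object-ids.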

More datapipeline CLI commands here.

AWS
SUPPORT ENGINEER
answered 5 months ago
  • Thank you!
