Unable to validate an instance profile with the role DataPipelineDefault


Hi,
I am facing a weird issue while trying to set up a Data Pipeline via CloudFormation.

The CloudFormation YAML file creates the two required roles (DataPipelineDefaultRole and DataPipelineDefaultResourceRole) and the pipeline itself, as described in the AWS doc: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-datapipeline-pipeline.html

I am using exactly that example, and I define the two roles by strictly following this AWS tutorial: https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-iam-roles.html

To make it short:

If I create the two roles via the AWS Web Console and then run the CloudFormation process, everything works as expected (the pipeline and all needed resources are properly created).

But if I include the creation of the roles in the CloudFormation file and skip the Web Console, I get the error below:

Pipeline Definition failed to validate because of following Errors: [{ObjectId = 'EmrClusterForBackup', errors = [Unable to validate an instance profile with the role name'DataPipelineDefaultResourceRole'.Please create an EC2 instance profile with the same name as your resource role]}] and Warnings: [{ObjectId = 'Default', warnings = ['pipelineLogUri'is missing. It is recommended to set this value on Default object for better troubleshooting.]}]

I have spent hours today debugging this issue, and as far as I can tell the generated roles are identical whether they are created via the Web Console or via the CloudFormation definition. I extracted their JSON definitions with the iam get-role command in both cases and they are indeed the same.

Can someone help out here?
Best,
M.

Edited by: tundraspar on Feb 6, 2019 1:22 PM

asked 5 years ago, 830 views
2 Answers

I just compared the role details and noticed that the one created via CloudFormation has an extra line ("Sid": ""), which is empty anyway:

Role generated via Web Console

{
    "Role": {
        "RoleName": "DataPipelineDefaultRole",
        "CreateDate": "2019-02-06T17:22:13Z",
        "RoleId": "AROAI2B7HMTSEAUOGJOK4",
        "Path": "/",
        "Arn": "arn:aws:iam::429416768433:role/DataPipelineDefaultRole",
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "Service": [
                            "elasticmapreduce.amazonaws.com",
                            "datapipeline.amazonaws.com"
                        ]
                    },
                    "Action": "sts:AssumeRole"
                }
            ]
        }
    }
}

Role generated via CloudFormation

{
    "Role": {
        "RoleName": "DataPipelineDefaultRole",
        "CreateDate": "2019-02-06T17:46:25Z",
        "RoleId": "AROAJGHEOSAQTO6DWRNWY",
        "Path": "/",
        "Arn": "arn:aws:iam::429416768433:role/DataPipelineDefaultRole",
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "Service": [
                            "datapipeline.amazonaws.com",
                            "elasticmapreduce.amazonaws.com"
                        ]
                    },
                    "Action": "sts:AssumeRole",
                    "Sid": ""
                }
            ]
        }
    }
}

Could this interfere somehow?
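
One way to check whether the two trust policies above differ in substance is to normalize them before comparing. The following is only a minimal sketch: the helper name normalize_trust_policy is made up for illustration, and it assumes that an empty "Sid" and the order of the service principals carry no meaning for the comparison.

```python
import json


def normalize_trust_policy(policy):
    """Return a canonical form of a trust policy so two documents can be compared.

    Assumptions (not confirmed by the thread): an empty "Sid" is treated as
    absent, and the order of principals and statements is treated as irrelevant.
    """
    statements = policy["Statement"]
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]

    canonical = []
    for stmt in statements:
        stmt = dict(stmt)  # shallow copy so the caller's data is not modified
        if stmt.get("Sid", "") == "":
            stmt.pop("Sid", None)
        principal = stmt.get("Principal")
        if isinstance(principal, dict):
            stmt["Principal"] = {
                kind: sorted([value] if isinstance(value, str) else value)
                for kind, value in principal.items()
            }
        canonical.append(json.dumps(stmt, sort_keys=True))

    return {"Version": policy.get("Version"), "Statement": sorted(canonical)}


# Trust policy of the role created via the Web Console (copied from the output above).
console_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "elasticmapreduce.amazonaws.com",
                    "datapipeline.amazonaws.com",
                ]
            },
            "Action": "sts:AssumeRole",
        }
    ],
}

# Trust policy of the role created via CloudFormation (copied from the output above).
cf_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "datapipeline.amazonaws.com",
                    "elasticmapreduce.amazonaws.com",
                ]
            },
            "Action": "sts:AssumeRole",
            "Sid": "",
        }
    ],
}

same = normalize_trust_policy(console_policy) == normalize_trust_policy(cf_policy)
print("Trust policies equivalent:", same)
```

Under those assumptions the two documents compare as equal, which suggests the difference lies outside the role definition. Note that iam get-role output does not list instance profiles; aws iam list-instance-profiles-for-role shows those.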

answered 5 years ago

After spending hours on this, I found out that you need to create the instance profile yourself (manually, or via configuration management if you use automation tools such as Ansible, Terraform, or Chef).

The AWS documentation was a bit misleading: the EmrCluster definition asks for a resourceRole, whereas the value to set there is actually the name of the instance profile created beforehand.

Here is my Terraform configuration:

# Trust policy: lets EC2, Data Pipeline, and EMR assume the role.
# (Written with Terraform 0.12+ expression syntax, no "${...}" wrappers.)
data "aws_iam_policy_document" "ec2_assume_role" {
  statement {
    effect = "Allow"
    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com", "datapipeline.amazonaws.com", "elasticmapreduce.amazonaws.com"]
    }
    actions = ["sts:AssumeRole"]
  }
}

# The role that the EMR cluster's EC2 instances will run under.
resource "aws_iam_role" "emr_ec2_instance_profile" {
  name               = "MyInstanceProfile"
  assume_role_policy = data.aws_iam_policy_document.ec2_assume_role.json
}

# Managed policy for EMR on EC2 (possibly not required).
resource "aws_iam_role_policy_attachment" "emr_ec2_instance_profile1" {
  role       = aws_iam_role.emr_ec2_instance_profile.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforEC2Role"
}

# Managed policy for Data Pipeline on EC2.
resource "aws_iam_role_policy_attachment" "emr_ec2_instance_profile2" {
  role       = aws_iam_role.emr_ec2_instance_profile.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforDataPipelineRole"
}

# Instance profile with the same name as the role; this is what the pipeline validates.
resource "aws_iam_instance_profile" "emr_ec2_instance_profile" {
  name = aws_iam_role.emr_ec2_instance_profile.name
  role = aws_iam_role.emr_ec2_instance_profile.name
}

In short:

  • Create the policy document (the trust policy)
  • Create an IAM role that uses it
  • Attach the two managed policies, MapReduce and DataPipeline (the first one may not be needed)
  • Most important: create an instance profile with the same name as the role and attach the role to it
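
Since the original question used CloudFormation rather than Terraform, the same last step can be sketched as a template fragment. This is a hedged illustration only: it builds the fragment as JSON from Python, the logical IDs "ResourceRole" and "ResourceRoleInstanceProfile" are hypothetical, and it assumes the role itself is declared elsewhere in the same template.

```python
import json

# Name taken from the error message in the question: the instance profile must
# carry the same name as the resource role.
RESOURCE_ROLE_NAME = "DataPipelineDefaultResourceRole"


def instance_profile_resource(role_logical_id, profile_name):
    """Build an AWS::IAM::InstanceProfile resource that wraps an existing role.

    role_logical_id is the (hypothetical) logical ID of an AWS::IAM::Role
    declared elsewhere in the template; Ref on a role resolves to its name.
    """
    return {
        "Type": "AWS::IAM::InstanceProfile",
        "Properties": {
            "InstanceProfileName": profile_name,
            "Path": "/",
            "Roles": [{"Ref": role_logical_id}],
        },
    }


# Fragment to merge into the "Resources" section of the existing template.
fragment = {
    "Resources": {
        "ResourceRoleInstanceProfile": instance_profile_resource(
            "ResourceRole", RESOURCE_ROLE_NAME
        )
    }
}

print(json.dumps(fragment, indent=2))
```

If the pipeline is created in the same stack, it may also be worth adding a DependsOn on the instance profile to the pipeline resource so that the profile exists before the definition is validated; that ordering is an assumption here, not something confirmed in this thread.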

Let me know if you need more help or get stuck.
Hope it helps!
Best

answered 5 years ago
