DataSync agent on EC2

I need to transfer files (NFS mount paths) from on-premises to Amazon EFS. In the documentation, I saw an option to run the DataSync agent on an EC2 instance. Can we SSH into that EC2 instance from the on-premises side and transfer the NFS files from on-premises to EFS? What are the detailed steps involved?

3 Answers

Hi,

I am trying to create a DataSync agent with Terraform, but retrieving the activation key times out with the following error:

Error: retrieving activation key from IP Address (x.x.x.254): error making HTTP request: Get "http://x.x.x.254/?gatewayType=SYNC&activationRegion=eu-west-2&endpointType=PRIVATE_LINK&privateLinkEndpoint=x.x.x153": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

This is the configuration I am using:

resource "aws_iam_instance_profile" "datasync-instance-profile" {
  name = "datasync-instance-profile-${var.datasync_agent["name"]}-${var.environment}"
  role = aws_iam_role.datasync-instance-role.name

  lifecycle {
    create_before_destroy = false
  }
}

resource "aws_instance" "datasync" {
  depends_on = [data.aws_subnets.private-subnets]

  ami                                  = data.aws_ami.datasync-agent-ami.id
  instance_type                        = var.ec2_inst_type
  instance_initiated_shutdown_behavior = "stop"

  disable_api_termination = false
  iam_instance_profile    = aws_iam_instance_profile.datasync-instance-profile.name

  vpc_security_group_ids      = [aws_security_group.datasync-instance-sg.id]
  subnet_id                   = data.aws_subnets.private-subnets.ids[0]
  associate_public_ip_address = false

  tags = {
    Name = "datasync-agent-instance-${var.datasync_agent["name"]}-${var.environment}",
    ami  = data.aws_ami.datasync-agent-ami.id
  }
}

resource "aws_vpc_endpoint" "datasync-vpc-endpoint" {
  service_name        = "com.amazonaws.${data.aws_region.current.name}.datasync"
  vpc_id              = data.aws_vpc.vpc.id
  security_group_ids  = [aws_security_group.datasync-instance-sg.id]
  subnet_ids          = [data.aws_subnets.private-subnets.ids[0]]
  vpc_endpoint_type   = "Interface"
  private_dns_enabled = true
}

resource "aws_datasync_agent" "datasync-agent" {
  depends_on = [aws_instance.datasync, aws_vpc_endpoint.datasync-vpc-endpoint]

  # With ip_address set, Terraform fetches the activation key over HTTP (port 80) from the agent.
  ip_address            = aws_instance.datasync.private_ip
  security_group_arns   = [aws_security_group.datasync-instance-sg.arn]
  subnet_arns           = [local.subnet_arns[0]]
  vpc_endpoint_id       = aws_vpc_endpoint.datasync-vpc-endpoint.id
  private_link_endpoint = data.aws_network_interface.vpc-network-interface.private_ip
  name                  = "datasync-agent-${var.datasync_agent["name"]}-${var.environment}"

  lifecycle {
    create_before_destroy = false
  }

  tags = {
    Name = "datasync-agent-${var.datasync_agent["name"]}-${var.environment}",
    env  = var.environment
  }
}

# Security group shared by the agent instance and the DataSync VPC endpoint.
resource "aws_security_group" "datasync-instance-sg" {
  name        = "datasync-${var.datasync_agent["name"]}-${var.environment}"
  description = "Datasync Security Group - ${var.datasync_agent["name"]}-${var.environment}"
  vpc_id      = data.aws_vpc.vpc.id

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [data.aws_vpc.vpc.cidr_block]
    description = "SSH"
  }

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = [data.aws_vpc.vpc.cidr_block]
    description = "HTTP"
  }

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [data.aws_vpc.vpc.cidr_block]
    description = "HTTPS"
  }

  ingress {
    from_port   = 1024
    to_port     = 1064
    protocol    = "tcp"
    cidr_blocks = [data.aws_vpc.vpc.cidr_block]
    description = "VPC endpoint"
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "datasync-agent-${var.datasync_agent["name"]}-${var.environment}",
    env  = var.environment
  }
}
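
One thing that may be worth checking, offered as a hedged note rather than a diagnosis: when `ip_address` is set, the Terraform AWS provider retrieves the activation key itself by making an HTTP request to port 80 of the agent, so the agent has to be reachable on port 80 from wherever Terraform runs. If Terraform runs outside the VPC CIDR that the security group admits, that request would time out in the way shown above. One possible workaround is to obtain the activation key out of band, from a host that can reach the agent, and pass it through `activation_key`, which conflicts with `ip_address`. A minimal sketch, in which the variable `datasync_activation_key` and the resource label `datasync-agent-from-key` are hypothetical names that are not in the original configuration:

```hcl
# Hypothetical variable: an activation key fetched out of band from a host
# that can reach the agent on port 80.
variable "datasync_activation_key" {
  type      = string
  sensitive = true
}

# Alternative to the ip_address-based agent resource; use one or the other,
# because activation_key and ip_address conflict.
resource "aws_datasync_agent" "datasync-agent-from-key" {
  activation_key        = var.datasync_activation_key
  security_group_arns   = [aws_security_group.datasync-instance-sg.arn]
  subnet_arns           = [local.subnet_arns[0]]
  vpc_endpoint_id       = aws_vpc_endpoint.datasync-vpc-endpoint.id
  private_link_endpoint = data.aws_network_interface.vpc-network-interface.private_ip
  name                  = "datasync-agent-${var.datasync_agent["name"]}-${var.environment}"
}
```

With this variant Terraform never contacts the agent directly, so only the host that fetches the key needs a network path to port 80 on the instance.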
answered 5 months ago

Thanks for the above. I have followed the notes on those pages, as you can see above. I wonder whether I am missing anything on the IAM side. What permissions does my DataSync role need for VPC connections? Would full read/write permissions be enough? Also, I found that for an S3-to-S3 transfer within the same account, I only need a task. When is an agent required?
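
For reference, the agentless S3-to-S3 case mentioned here could look roughly like the sketch below. It is an illustration under assumptions: the bucket ARNs, the resource labels, and the IAM role `aws_iam_role.datasync-s3-access` are hypothetical placeholders and would need to match real buckets and a role that DataSync can assume.

```hcl
# Source bucket location (hypothetical bucket name).
resource "aws_datasync_location_s3" "source" {
  s3_bucket_arn = "arn:aws:s3:::example-source-bucket"
  subdirectory  = "/"

  s3_config {
    # Hypothetical role that grants DataSync access to the bucket.
    bucket_access_role_arn = aws_iam_role.datasync-s3-access.arn
  }
}

# Destination bucket location (hypothetical bucket name).
resource "aws_datasync_location_s3" "destination" {
  s3_bucket_arn = "arn:aws:s3:::example-destination-bucket"
  subdirectory  = "/"

  s3_config {
    bucket_access_role_arn = aws_iam_role.datasync-s3-access.arn
  }
}

# The task connects the two locations; no agent is referenced anywhere.
resource "aws_datasync_task" "s3-to-s3" {
  name                     = "s3-to-s3-example"
  source_location_arn      = aws_datasync_location_s3.source.arn
  destination_location_arn = aws_datasync_location_s3.destination.arn
}
```

No `aws_datasync_agent` resource or `agent_arns` argument appears because both locations are AWS storage services. An agent comes into play when one side is storage that DataSync cannot reach directly, such as an on-premises NFS share.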

answered 5 months ago

Hi. In general, it is recommended to deploy the AWS DataSync agent as close as possible to the source storage system to minimize network latency. You can deploy the agent as an Amazon EC2 instance by following these steps from the documentation:
https://docs.aws.amazon.com/datasync/latest/userguide/deploy-agents.html#ec2-deploy-agent

Once the agent is deployed, you configure an AWS DataSync task as you would for a typical on-premises transfer. Run a thorough proof of concept to confirm that this configuration meets your business needs. https://docs.aws.amazon.com/datasync/latest/userguide/getting-started.html
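
The task configuration described above could be sketched in Terraform roughly as follows. This is a minimal sketch under assumptions, not a verified setup: every variable, the NFS hostname, and the export path are hypothetical placeholders, and the agent is assumed to be already activated and able to reach the NFS server.

```hcl
# Hypothetical inputs; replace with values from your own environment.
variable "datasync_agent_arn" { type = string }     # ARN of the activated agent
variable "efs_file_system_arn" { type = string }    # ARN of the target EFS file system
variable "efs_subnet_arn" { type = string }         # subnet with an EFS mount target
variable "efs_security_group_arn" { type = string } # security group allowed to reach the mount target

# Source: the on-premises NFS export, read through the agent.
resource "aws_datasync_location_nfs" "on-prem-source" {
  server_hostname = "nfs.example.internal" # hypothetical NFS server
  subdirectory    = "/exports/data"        # hypothetical export path

  on_prem_config {
    agent_arns = [var.datasync_agent_arn]
  }
}

# Destination: the EFS file system, accessed through a mount target.
resource "aws_datasync_location_efs" "destination" {
  efs_file_system_arn = var.efs_file_system_arn
  subdirectory        = "/"

  ec2_config {
    security_group_arns = [var.efs_security_group_arn]
    subnet_arn          = var.efs_subnet_arn
  }
}

# The task ties source and destination together.
resource "aws_datasync_task" "nfs-to-efs" {
  name                     = "nfs-to-efs-example"
  source_location_arn      = aws_datasync_location_nfs.on-prem-source.arn
  destination_location_arn = aws_datasync_location_efs.destination.arn
}
```

Only the NFS location references the agent. The EFS side is reached by DataSync itself through the subnet and security group given in `ec2_config`.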

AWS
answered a year ago
