DataSync agent on EC2


I need to transfer files (NFS mount paths) from on-premises to Amazon EFS. In the documentation, I saw an option to run the DataSync agent on an EC2 instance. Can we SSH to that EC2 instance from the on-premises side and transfer the NFS files from on-premises to EFS? What are the detailed steps involved?

Asked 2 years ago, viewed 1503 times
3 Answers

Hi,

I am trying to activate a DataSync agent with Terraform, but retrieving the activation key times out with the following error:

Error: retrieving activation key from IP Address (x.x.x.254): error making HTTP request: Get "http://x.x.x.254/?gatewayType=SYNC&activationRegion=eu-west-2&endpointType=PRIVATE_LINK&privateLinkEndpoint=x.x.x153": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

This is my configuration:

resource "aws_iam_instance_profile" "datasync-instance-profile" {
  name = "datasync-instance-profile-${var.datasync_agent["name"]}-${var.environment}"
  role = aws_iam_role.datasync-instance-role.name

  lifecycle {
    create_before_destroy = false
  }
}

resource "aws_instance" "datasync" {
  depends_on = [data.aws_subnets.private-subnets]

  ami                                  = data.aws_ami.datasync-agent-ami.id
  instance_type                        = var.ec2_inst_type
  instance_initiated_shutdown_behavior = "stop"

  disable_api_termination = false
  iam_instance_profile    = aws_iam_instance_profile.datasync-instance-profile.name

  vpc_security_group_ids      = [aws_security_group.datasync-instance-sg.id]
  subnet_id                   = data.aws_subnets.private-subnets.ids[0]
  associate_public_ip_address = false

  tags = {
    Name = "datasync-agent-instance-${var.datasync_agent["name"]}-${var.environment}"
    ami  = data.aws_ami.datasync-agent-ami.id
  }
}

resource "aws_vpc_endpoint" "datasync-vpc-endpoint" {
  service_name        = "com.amazonaws.${data.aws_region.current.name}.datasync"
  vpc_id              = data.aws_vpc.vpc.id
  security_group_ids  = [aws_security_group.datasync-instance-sg.id]
  subnet_ids          = [data.aws_subnets.private-subnets.ids[0]]
  vpc_endpoint_type   = "Interface"
  private_dns_enabled = true
}

resource "aws_datasync_agent" "datasync-agent" {
  depends_on = [aws_instance.datasync, aws_vpc_endpoint.datasync-vpc-endpoint]

  name                  = "datasync-agent-${var.datasync_agent["name"]}-${var.environment}"
  ip_address            = aws_instance.datasync.private_ip
  security_group_arns   = [aws_security_group.datasync-instance-sg.arn]
  subnet_arns           = [local.subnet_arns[0]]
  vpc_endpoint_id       = aws_vpc_endpoint.datasync-vpc-endpoint.id
  private_link_endpoint = data.aws_network_interface.vpc-network-interface.private_ip

  lifecycle {
    create_before_destroy = false
  }

  tags = {
    Name = "datasync-agent-${var.datasync_agent["name"]}-${var.environment}"
    env  = var.environment
  }
}

resource "aws_security_group" "datasync-instance-sg" {
  name        = "datasync-${var.datasync_agent["name"]}-${var.environment}"
  description = "Datasync Security Group - ${var.datasync_agent["name"]}-${var.environment}"
  vpc_id      = data.aws_vpc.vpc.id

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [data.aws_vpc.vpc.cidr_block]
    description = "SSH"
  }

  # Port 80 is the port the activation key is requested on.
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = [data.aws_vpc.vpc.cidr_block]
    description = "HTTP"
  }

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [data.aws_vpc.vpc.cidr_block]
    description = "HTTPS"
  }

  # Agent-to-VPC-endpoint control traffic.
  ingress {
    from_port   = 1024
    to_port     = 1064
    protocol    = "tcp"
    cidr_blocks = [data.aws_vpc.vpc.cidr_block]
    description = "VPC endpoint"
  }

  # Allow all outbound traffic.
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "datasync-agent-${var.datasync_agent["name"]}-${var.environment}"
    env  = var.environment
  }
}
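
One thing that may be worth checking for the timeout: the aws_datasync_agent resource fetches the activation key over HTTP (port 80) from the machine that runs Terraform, so that machine needs a network path to the agent's private IP, and the security group has to allow it. The rules above only allow port 80 from the VPC CIDR. Below is a rough sketch of the port 80 rule widened for a Terraform host outside the VPC. The variable name `terraform_runner_cidr` is made up for illustration and is not part of my configuration, and this is only a guess at the cause, not a confirmed fix.

```hcl
# Hypothetical input: CIDR of the host that runs `terraform apply`
# (for example a CI runner reaching the VPC over VPN or peering).
variable "terraform_runner_cidr" {
  type        = string
  description = "CIDR of the machine running Terraform (assumed name)"
}

resource "aws_security_group" "datasync-instance-sg" {
  # ... name, description, vpc_id and the other rules stay as they are ...

  # Activation: the Terraform host must reach the agent on port 80.
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = [data.aws_vpc.vpc.cidr_block, var.terraform_runner_cidr]
    description = "HTTP (agent activation)"
  }
}
```

If Terraform runs somewhere with no route to the private subnet at all, opening the security group will not help on its own, and the activation would have to be run from a host inside the VPC or over a VPN.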
Answered 10 months ago

Thanks for the above. I have followed the notes on those pages, as you can see above. I wonder whether I am missing anything on the IAM side. What permissions should my DataSync role have for VPC connections? Would full read/write permissions be enough? Also, I found that for an S3-to-S3 transfer within the same account I only need a task. When is an agent required?
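
For reference, the agentless S3-to-S3 case I mean looks roughly like the sketch below. As far as I understand, an agent is only needed when one side of the transfer is on-premises or otherwise self-managed storage (NFS, SMB and so on), not for transfers between AWS storage services. The variable names are placeholders I made up for this example.

```hcl
# Placeholder inputs (assumed names).
variable "source_bucket_arn" { type = string }
variable "destination_bucket_arn" { type = string }
variable "bucket_access_role_arn" { type = string } # IAM role DataSync assumes to access the buckets

resource "aws_datasync_location_s3" "source" {
  s3_bucket_arn = var.source_bucket_arn
  subdirectory  = "/"

  s3_config {
    bucket_access_role_arn = var.bucket_access_role_arn
  }
}

resource "aws_datasync_location_s3" "destination" {
  s3_bucket_arn = var.destination_bucket_arn
  subdirectory  = "/"

  s3_config {
    bucket_access_role_arn = var.bucket_access_role_arn
  }
}

# No aws_datasync_agent is referenced anywhere: the task only needs the two locations.
resource "aws_datasync_task" "s3_to_s3" {
  name                     = "s3-to-s3-example"
  source_location_arn      = aws_datasync_location_s3.source.arn
  destination_location_arn = aws_datasync_location_s3.destination.arn
}
```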

Answered 10 months ago

Hi, in general it is recommended to deploy the AWS DataSync agent as close as possible to the source storage system to minimize network latency. You can deploy the agent as an Amazon EC2 instance by following these steps from the documentation:
https://docs.aws.amazon.com/datasync/latest/userguide/deploy-agents.html#ec2-deploy-agent

Once the agent is deployed, you configure an AWS DataSync task as you would for a typical on-premises transfer. A thorough proof of concept is required to understand whether this type of configuration meets your business needs. https://docs.aws.amazon.com/datasync/latest/userguide/getting-started.html
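
As a rough Terraform sketch of that task configuration, assuming the agent is already activated: an NFS source location that points at the agent, an EFS destination location, and a task joining the two. Every variable name, the hostname `nfs.example.internal` and the path `/export/data` are placeholders for illustration, not values from the question.

```hcl
# Placeholder inputs (assumed names; adjust to your environment).
variable "agent_arn" { type = string }              # ARN of the activated DataSync agent
variable "efs_file_system_arn" { type = string }    # destination EFS file system
variable "efs_subnet_arn" { type = string }         # subnet that has an EFS mount target
variable "efs_security_group_arn" { type = string } # security group allowed to reach EFS over NFS (2049)

# Source: the on-premises NFS export, read through the agent.
resource "aws_datasync_location_nfs" "source" {
  server_hostname = "nfs.example.internal"
  subdirectory    = "/export/data"

  on_prem_config {
    agent_arns = [var.agent_arn]
  }
}

# Destination: the EFS file system, reached through ENIs in the given subnet.
resource "aws_datasync_location_efs" "destination" {
  efs_file_system_arn = var.efs_file_system_arn

  ec2_config {
    security_group_arns = [var.efs_security_group_arn]
    subnet_arn          = var.efs_subnet_arn
  }
}

# Task: copies from the NFS location to the EFS location when started.
resource "aws_datasync_task" "nfs_to_efs" {
  name                     = "nfs-to-efs-example"
  source_location_arn      = aws_datasync_location_nfs.source.arn
  destination_location_arn = aws_datasync_location_efs.destination.arn
}
```

Creating the task does not start a transfer. You still start a task execution from the console, CLI or API. Note that the agent needs network access to the NFS server, so an agent on EC2 only works if the VPC can reach the on-premises NFS export (for example over VPN or Direct Connect).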

AWS
Answered 2 years ago
