How can I sync an S3 bucket and an EC2 server bidirectionally?

0

Hello team,

I have a WordPress application deployed on Elastic Beanstalk. Static content is served from an S3 bucket, and dynamic content is served by the EC2 server.

My project has a wp-content/uploads/ folder in its root directory. This uploads folder is synced with the S3 bucket by a script placed in the .ebextensions folder.

Below is the .ebextensions script, which adds an entry to the crontab on the EC2 server.

commands:
  01_set_cron_job:
    command: |
      sudo crontab -l | { cat; echo "*/1 * * * * aws s3 sync --delete --size-only s3://mydemobucket/wp-content/uploads/ /var/app/current/wp-content/uploads/"; } | sudo crontab -

Whenever a new EC2 instance is launched in my Beanstalk Auto Scaling environment, all the objects in the S3 bucket are synced to the newly launched instance.

This direction works perfectly for me.

My problem is this: if my developers upload images to the uploads folder through the WordPress application, how do I copy the updated images from the EC2 server to the S3 bucket? I have tried and am a little confused.

So far, I am able to sync from S3 to the EC2 server with the help of crontab. This is one direction: from S3 to the EC2 server.

I would like to copy the images from the EC2 server to the S3 bucket whenever images are added or changed on the EC2 server. This is the other direction: from EC2 to the S3 bucket.

I need bidirectional sync without any impact. Can anyone please help me achieve this?

asked 2 months ago, 160 views
3 Answers
1

Have you considered using Amazon EFS instead of EBS if you need shared storage accessible from multiple EC2 instances? Amazon EFS can be mounted on multiple instances and integrated with services like DataSync for bidirectional sync with S3. Use AWS DataSync to configure ongoing, automated sync between an EFS file system and S3. AWS DataSync can sync in both directions based on filters you define.

https://docs.aws.amazon.com/efs/latest/ug/trnsfr-data-using-datasync.html

AWS
Anand
answered 2 months ago
  • Yes, we could use that if I needed to share files between multiple instances. In my case, I am also using Beanstalk with an ASG. However, I usually run only one instance rather than multiple EC2 instances. There are not enough visitors to generate much load.

1

Mountpoint for S3 can mount an Amazon S3 bucket as a local file system on your EC2 instance. It supports basic file system operations: listing and reading existing files and creating new ones. It cannot modify existing files or delete directories, and it does not support symbolic links or file locking. Mountpoint is ideal for applications that do not need all of the features of a shared file system and POSIX-style permissions but require Amazon S3's elastic throughput to read and write large S3 datasets. Refer to Working with Mountpoint for Amazon S3 for details.

If you need POSIX-style storage, consider EFS. Refer to How can I mount an Amazon EFS volume to an instance in my Elastic Beanstalk environment? for an overview.

AWS
EXPERT
Mike_L
answered 2 months ago
  • Yes Mike, I would use the S3 bucket rather than EFS.

0

Running aws s3 sync --delete --size-only s3://mydemobucket/wp-content/uploads/ /var/app/current/wp-content/uploads/ every minute using cron effectively updates the contents of the uploads directory on the EC2 to match the uploads prefix in the bucket, and the --delete flag means it deletes anything in the EC2 directory that isn't in the bucket. See https://docs.aws.amazon.com/cli/latest/userguide/cli-services-s3-commands.html#using-s3-commands-managing-objects-sync

When one of your developers uploads a file to the directory on the EC2, the next time the cron job runs (which will be within a minute) this new file will be deleted because of the --delete flag.
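To see this effect without losing any files, the same command can be run with --dryrun, which makes the CLI print the operations it would perform instead of executing them. The following is a minimal sketch, not a definitive tool: the helper names preview_sync and preview_deletes are hypothetical, the bucket and path are the ones from the question, and the "(dryrun) delete:" line prefix that the filter relies on is an assumption about the CLI's output format.

```shell
#!/bin/sh
# Bucket prefix and local directory taken from the question.
BUCKET_URI="s3://mydemobucket/wp-content/uploads/"
LOCAL_DIR="/var/app/current/wp-content/uploads/"

# preview_sync (hypothetical helper): show what the S3 -> EC2 sync would do.
# --dryrun prints each operation without performing it.
preview_sync() {
  aws s3 sync --dryrun --delete --size-only "$BUCKET_URI" "$LOCAL_DIR"
}

# preview_deletes (hypothetical helper): keep only the lines for local files
# that --delete would remove, i.e. files on the EC2 that are not in the bucket.
# Assumes dry-run delete lines start with "(dryrun) delete:".
preview_deletes() {
  preview_sync | grep '^(dryrun) delete:' || true
}
```

Sourcing this file and calling preview_deletes right after a developer uploads an image should list that image, if the CLI output has the assumed format.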

You may be able to "beat" this by using something like incron to kick off a new sync job any time a change is made on the EC2 and sync that file to the bucket. But that's not 100% guaranteed to happen every time.
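As a rough sketch of that event-driven idea, the script below swaps in inotifywait (from the inotify-tools package, assumed to be installed) in place of incron, because its -r option watches subdirectories as well. The helper names s3_uri_for and watch_uploads are hypothetical, and the bucket and path are the ones from the question. It narrows the window in which the cron job can delete a new file, but it does not close it.

```shell
#!/bin/sh
# Assumes inotify-tools (inotifywait) and the AWS CLI are installed.
LOCAL_DIR="/var/app/current/wp-content/uploads"
BUCKET_URI="s3://mydemobucket/wp-content/uploads"

# s3_uri_for (hypothetical helper): map a local file path to its S3 URI by
# stripping the local directory prefix and appending the rest to the bucket URI.
s3_uri_for() {
  printf '%s/%s\n' "$BUCKET_URI" "${1#"$LOCAL_DIR"/}"
}

# watch_uploads (hypothetical helper): copy each finished file to the bucket.
# close_write fires when a writer closes a file; moved_to covers files that
# are renamed into the watched tree.
watch_uploads() {
  inotifywait -m -r -q -e close_write -e moved_to --format '%w%f' "$LOCAL_DIR" |
  while IFS= read -r path; do
    [ -f "$path" ] && aws s3 cp "$path" "$(s3_uri_for "$path")"
  done
}
```

Calling watch_uploads from a long-running service keeps the watcher alive. Files written by the S3-to-EC2 cron job also trigger events, so they are copied back to the bucket; that is harmless but wasteful.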

As other answers have already said, there are other AWS services that are probably a better fit for your use case.

EXPERT
Steve_M
answered 2 months ago
  • Hi Steve, thanks for the prompt response. I tried something. Can you please check the commands below? Will they work as I expect?

    */3 * * * * aws s3 sync --delete --size-only s3://examplebucket/wp-content/uploads/ /var/app/current/wp-content/uploads/
    */1 * * * * aws s3 sync --size-only /var/app/current/wp-content/uploads/ s3://examplebucket/wp-content/uploads/

  • I don't know what you're expecting it to do.

    I see what it's trying to do: every minute it syncs the EC2 uploads directory to the uploads folder in the bucket (and if it finds an object in S3 that is not on EC2, it leaves it there). And every three minutes another job syncs the bucket with the directory on EC2 (this time, if a file exists on EC2 that is not in S3, it is deleted).

    This will more-or-less close the gap, but there are still going to be edge cases like when your developer writes a file to the local directory at (for instance) 12:03:00.5 (half a second past 12:03). At this time the first job (EC2 to S3) may have already completed before the file was uploaded. The second job (S3 to EC2) may still be running, and the new file will be found by that and will be deleted.

    You could use an event-based daemon like incrond to handle a new file appearing in the local directory, and something using S3 Events to handle a new object in the bucket (e.g. an SQS queue as the destination, and have the EC2 listening for new messages on the queue, process the message and sync the new object as it appears).

    It might be worth revisiting the need to run aws s3 sync with the --delete flag in the first place.
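One way to act on that last suggestion is to drop --delete from both directions and run them in a fixed order from a single job, so that a file present on only one side is copied across instead of removed. This is a minimal sketch under that assumption; the helper name sync_both is hypothetical, and the bucket and path are the ones from the comment above.

```shell
#!/bin/sh
LOCAL_DIR="/var/app/current/wp-content/uploads/"
BUCKET_URI="s3://examplebucket/wp-content/uploads/"

# sync_both (hypothetical helper): push local files first, then pull new ones.
# Neither direction uses --delete, so nothing is removed on either side.
# The pull only runs if the push succeeded.
sync_both() {
  aws s3 sync --size-only "$LOCAL_DIR" "$BUCKET_URI" &&
  aws s3 sync --size-only "$BUCKET_URI" "$LOCAL_DIR"
}
```

The trade-off is that deletions no longer propagate: a file removed on one side is restored from the other on the next run. Because of --size-only, a file replaced by one of the same size is also not picked up. If this is run from cron, wrapping the call in flock (from util-linux) is one way to keep runs from overlapping.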
