Best way to expose files from EFS over HTTP(S)?


I have some dynamically-generated files (more context below) stored on EFS and need to expose these files over HTTPS.
I'm wondering what the best way to do this would be…

I've thought of a few ideas; some might be doable and others might not. I'm curious to see what other people think:

  1. Set up a CloudFront distribution and register my EFS file system as an origin. This works fine for S3 but doesn't seem to be possible for EFS :-(
  2. Set up some replication mechanism that would upload files to S3 as soon as they are created in EFS. I haven't checked yet whether EFS can generate an event (maybe to EventBridge?) when a file has just been created, but if it can, plugging in another Lambda to copy from EFS to S3 would work… Or maybe a managed service could do that for me? (I don't really want to update my code to raise an event when a file has been generated; I'd rather have AWS generate that event automatically.)
  3. Set up CloudFront -> API Gateway -> Lambda to serve the file from EFS. Executing a Lambda to serve a file is not optimal from a "cost" point of view, but those files could be cached by CloudFront forever, making this approach OK-ish.

Does one of these approaches sound like what you would do? Do you have another idea / recommendation?


More context:

  • The files are created on EFS by a Lambda function -- when that Lambda function is called, it downloads an image and generates a thumbnail. That thumbnail is stored, as a not-too-big file, on EFS.
  • If the Lambda was running my own code, I would change it to write the thumbnail to S3 (and set up a CloudFront distribution to serve the thumbnails over HTTPS, idea #1). But this is not my code and I'm not too fond of modifying it…
  • When a thumbnail is generated, it needs to be available over HTTP quickly (a delay of 1-5 seconds is OK-ish, 1-5 minutes is not OK).
  • After a thumbnail has been generated, it is never updated. And thumbnails are rarely deleted (and keeping old "deleted" thumbnails for even days is OK).
  • Estimates: there will be between one and ten thousand thumbnails on EFS. Total size will be between 1 and 10 GB or so.
  • I expect only a few (a dozen, max) new thumbnails will be generated each day, which means a non-serverless and always-running approach will not be optimal from a "cost" point of view.
2 Answers

As you say, EFS is not ideal for this use case. EFS provides low-latency access to large volumes of information, but only through an otherwise-opaque NFS file system interface. It does not provide an HTTP API for its contents, nor does it produce lifecycle events for files in it.

I'd say your best bet is #2, replicating to S3. While it requires more changes to your existing setup, it has fewer moving parts overall and should be lower maintenance. You'll have to hook into your existing thumbnail generation process, but there are a few options for that. Without modifying your thumbnail generation Lambda function, probably the easiest is to add a Lambda destination to it (destinations apply to asynchronous invocations, so this assumes the function is invoked that way). You'll create a new Lambda function, which will receive the input and output of the thumbnail generation function, and can pull the thumbnail from EFS and put it into S3.
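As a rough illustration of that second function, here is a minimal Python sketch. It assumes the copy function mounts the same EFS access point at `/mnt/thumbnails`, that the bucket name arrives through a `THUMBNAIL_BUCKET` environment variable, and that the thumbnail generator's return value contains a `thumbnail_path` field. All three are hypothetical and need adapting to what the real function returns; only `responsePayload` comes from the record Lambda passes to an on-success destination.

```python
import os
from pathlib import Path

# Assumptions (hypothetical): mount point and bucket name come from env vars.
EFS_MOUNT = Path(os.environ.get("EFS_MOUNT", "/mnt/thumbnails"))
BUCKET = os.environ.get("THUMBNAIL_BUCKET", "example-thumbnail-bucket")


def resolve_source(event, mount=EFS_MOUNT):
    """Locate the thumbnail on EFS from the destination record.

    Assumes the generator returned {"thumbnail_path": "..."}; that field
    name is hypothetical.
    """
    relative = event["responsePayload"]["thumbnail_path"]
    source = (mount / relative).resolve()
    # Refuse paths that escape the mount (for example "../../etc/passwd").
    if mount.resolve() not in source.parents:
        raise ValueError(f"{relative!r} is outside {mount}")
    return source


def copy_to_s3(source, s3_client, bucket=BUCKET, mount=EFS_MOUNT):
    """Upload the file, using its path relative to the mount as the S3 key."""
    key = source.relative_to(mount.resolve()).as_posix()
    s3_client.upload_file(str(source), bucket, key)
    return key


def handler(event, context):
    # Imported lazily so the helpers above can be exercised without AWS.
    import boto3

    s3 = boto3.client("s3")
    key = copy_to_s3(resolve_source(event), s3)
    return {"bucket": BUCKET, "key": key}
```

The copy function needs the same VPC and EFS access point configuration as the generator, plus `s3:PutObject` permission on the bucket. A CloudFront distribution with the bucket as its origin then covers the HTTPS part.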

answered 3 years ago
reviewed 3 years ago

AWS DataSync can serve as your replication mechanism.

It is a managed service that simplifies transferring data from an EFS file system to an S3 bucket. You can also schedule periodic replication of an EFS file system to an S3 bucket.
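For illustration, a scheduled task could be created with boto3 roughly as follows. This is a sketch under assumptions: the EFS and S3 locations are presumed to exist already (their ARNs are passed in), and the task name and hourly schedule are placeholder choices. A schedule this coarse would not by itself meet the few-seconds availability requirement in the question, so it fits best as a backlog or catch-up mechanism.

```python
def build_task_request(source_location_arn, destination_location_arn,
                       schedule="rate(1 hour)"):
    """Build the arguments for DataSync's create_task call.

    The name and schedule are illustrative placeholders.
    """
    return {
        "Name": "efs-thumbnails-to-s3",
        "SourceLocationArn": source_location_arn,
        "DestinationLocationArn": destination_location_arn,
        "Schedule": {"ScheduleExpression": schedule},
        "Options": {
            # Copy only files that differ between source and destination.
            "TransferMode": "CHANGED",
            # Keep objects in S3 even after the file is deleted from EFS.
            "PreserveDeletedFiles": "PRESERVE",
        },
    }


def create_replication_task(datasync_client, source_location_arn,
                            destination_location_arn):
    """Create the scheduled task and return its ARN."""
    request = build_task_request(source_location_arn, destination_location_arn)
    response = datasync_client.create_task(**request)
    return response["TaskArn"]
```

Usage would be along the lines of `create_replication_task(boto3.client("datasync"), efs_location_arn, s3_location_arn)`, where both ARNs come from locations you have already registered with DataSync.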

Note/Cost optimization: VPC endpoints for DataSync use AWS PrivateLink. When you use DataSync in a VPC, the agent can communicate directly without needing to cross the public internet.

answered 3 years ago
