Need suggestions on how to look up existing objects on S3

Hi, we have incoming files every day, and we upload them to S3. The files can be duplicates from earlier days, so we need to check whether they already exist before uploading.

Our old approach is to save the filename to SDB after uploading a new file, so we can use an SDB query to look up existing files.

We now want to switch to the S3 HEAD Object API to check for existence.

We need suggestions:
(1) Is there a better way besides SDB and the S3 HEAD API? Is there any newer S3 API to check existence in bulk?
(2) Is the S3 HEAD API good enough for our use case? We need to look up ~200 filenames every hour during the day.

Thanks in advance!

xpli
Asked 5 years ago · 189 views
2 Answers
Accepted Answer

Hi,

I use S3 pretty extensively, but I'm not employed by Amazon, so take this for what it's worth.

That number of HEAD requests per hour will be totally fine; it is nowhere close to stressing the service.
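
For reference, a single existence check via HEAD could look roughly like the sketch below. It assumes a boto3 S3 client is passed in; the function name `object_exists` and the bucket and key names are hypothetical. To stay free of imports, it reads the error code from the exception's `response` attribute, which is how botocore's `ClientError` exposes it.

```python
def object_exists(s3_client, bucket, key):
    """Return True if a HEAD request finds the object, False on a 404."""
    try:
        s3_client.head_object(Bucket=bucket, Key=key)
        return True
    except Exception as exc:
        # botocore's ClientError carries the service error in exc.response;
        # a missing key on HEAD surfaces as error code "404".
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code in ("404", "NoSuchKey", "NotFound"):
            return False
        # Anything else (403, throttling, network errors) is not "missing".
        raise


# Hypothetical usage, assuming boto3 is installed and credentials are set up:
#   import boto3
#   s3 = boto3.client("s3")
#   if not object_exists(s3, "my-bucket", "incoming/file-001.csv"):
#       s3.upload_file("file-001.csv", "my-bucket", "incoming/file-001.csv")
```

Re-raising non-404 errors is deliberate: treating a permissions error as "not found" would trigger needless re-uploads.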

There isn't really a better way to check for existence of a random key. However, if you have a lot of keys at once and the keys share a path-like structure, you could do better by executing a list request on the common prefix and checking the contents.
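
As a rough sketch of the list-based approach, assuming a boto3 S3 client and that the incoming files share a common key prefix (the function name and the prefix in the usage comment are hypothetical):

```python
def existing_keys_under_prefix(s3_client, bucket, prefix):
    """Collect every key under a prefix using paginated LIST requests."""
    keys = set()
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when nothing matches the prefix.
        for obj in page.get("Contents", []):
            keys.add(obj["Key"])
    return keys


# Hypothetical usage: one listing, then local membership checks.
#   import boto3
#   s3 = boto3.client("s3")
#   known = existing_keys_under_prefix(s3, "my-bucket", "incoming/2020-01-")
#   to_upload = [k for k in candidate_keys if k not in known]
```

Each LIST page returns up to 1,000 keys, so this only pays off when the prefix is narrow enough that the listing costs fewer requests than the individual HEAD calls would.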

Keep in mind that S3 may not be immediately consistent. You should read the docs to fully understand the impact to your particular use case, but a couple things stick out:

  1. If you do a GET/HEAD before uploading, then PUT, then GET/HEAD again, that last response is eventually consistent, i.e. it is not guaranteed to report that the object exists.
  2. LIST requests are eventually consistent.

Given that, and given the number of objects you expect to check, I would recommend keeping it simple and just doing HEAD, maybe with time-based retries to work around the eventual consistency issue. If you make the wrong decision, you simply re-upload a duplicate, which is not data loss but just extra cost to you, so if this happens every once in a while, it shouldn't be too big of a deal.
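
One way to read "time-based retries" is to repeat a negative check a few times, with a pause in between, before concluding the object is absent. The helper below is a minimal sketch under that assumption; its name, the attempt count, and the delay are all hypothetical, and `check` stands for any zero-argument function that performs the HEAD request and returns True or False.

```python
import time


def exists_after_retries(check, attempts=3, delay_seconds=2.0, sleep=time.sleep):
    """Call check() up to `attempts` times; return True on the first hit."""
    for attempt in range(attempts):
        if check():
            return True
        # Pause between attempts, but not after the final one.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False


# Hypothetical usage, where head_says_exists(key) wraps a HEAD request:
#   if not exists_after_retries(lambda: head_says_exists("incoming/file-001.csv")):
#       ...upload the file...
```

With about 200 lookups an hour, a few seconds of extra waiting per missing file should be affordable, and a false "missing" result still only costs a duplicate upload.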

Hope this helps!

Answered 5 years ago

Thank you. I agree that keeping it simple is the right way to start. Your suggestions are very much appreciated.

xpli
Answered 5 years ago