Basic S3 Security for Web App

I have a public S3 bucket in US-East that hosts a website made up of HTML and JavaScript functions that run in the browser.

I want some of these JavaScript functions to send data (essentially CSV or JSON) to a different S3 bucket that will accept and store it. It's essentially custom analytics from the users' actions.

What I don't want is for anyone to be able to send data to that bucket except the application itself, as hosted on the original S3 bucket.

Since the data will be writable but not readable, my concern is not theft of data (it's meaningless analytics) but deliberate pollution of the data by malicious people.

I'm unsure how to do this. In general I'm assuming the following:

The source of my web app will be viewable, as all frontend source is, so the address of the analytics bucket is exposed, and perhaps even passwords. However, I'm also assuming that the receiving bucket itself is configured so that it rejects any attempt to send it data that doesn't come from code hosted on the host S3 bucket. I also want to assume it does this in a way that easy header spoofs, etc. can't bypass the write permissions on the S3 bucket.

Any thoughts on where to read about how to do this? Nothing I've seen so far hits the nail on the head in a way that I understand and find fairly infallible.

2 Answers

To secure S3 access from a frontend web application, you really need some type of server-side authentication. As you astutely point out, there's no way to really "hide" any credentials that might be used to authenticate your application and thereby limit who or what can write to your S3 bucket. You can't even really use the IP address, because this is a frontend (FE) app and the IP address will just represent the computer where the browser is running. Because you can't establish a reliable identity (or principal, if we're being technical), the only way to write to the bucket directly from your web app is to make it public. Once it's public, there's not really an effective way to limit "who" writes to it.
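To make the spoofing concern concrete, here is a minimal sketch of why request headers can't serve as the identity. S3 bucket policies do offer an `aws:Referer` condition key, and CORS rules exist, but CORS is enforced by browsers and any non-browser client can set `Origin` and `Referer` to whatever it likes. The bucket names and URLs below are hypothetical, and the request is only built, never sent.

```python
import urllib.request

# Hypothetical names, used only for illustration.
ANALYTICS_URL = "https://example-analytics-bucket.s3.amazonaws.com/events/forged.json"
SITE_ORIGIN = "https://example-site-bucket.s3.amazonaws.com"


def build_forged_upload(payload: bytes) -> urllib.request.Request:
    """Build (but do not send) a PUT that claims to come from the real site."""
    return urllib.request.Request(
        ANALYTICS_URL,
        data=payload,
        method="PUT",
        headers={
            # Both headers are plain client-supplied strings; nothing proves
            # that the request really originated from the hosted web app.
            "Origin": SITE_ORIGIN,
            "Referer": SITE_ORIGIN + "/index.html",
            "Content-Type": "application/json",
        },
    )


request = build_forged_upload(b'{"event": "fake_click"}')
print(request.get_method(), request.get_header("Referer"))
```

A publicly writable bucket that checks only these headers would accept such a request from any script, which is why the headers can't stand in for a principal.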

So I think the solution is to make the bucket NOT public and limit "who" can access it by introducing server-side authentication. One way to do this would be to require logins to your site with something like Cognito. Then you'd create an API Gateway backed by a Lambda function to accept the JSON and/or CSV payloads and write them to the S3 bucket. You would lock down the S3 bucket so that only the Lambda function could write to it, and you'd add a Cognito authorizer to the API Gateway to make sure the incoming data is generated by an actual user on the website.

The real, somewhat philosophical issue here is that you seem to be interested in authenticating the website, not its users. But you can't authenticate a frontend website automatically without exposing your credentials, so you need some form of user interaction. An acceptable solution might be to piggyback on user logins via Cognito as described above.
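As a rough sketch of the Lambda piece of that setup (not a definitive implementation), the handler below assumes a REST API with Lambda proxy integration and a Cognito user pool authorizer, in which case the verified token claims arrive under `requestContext.authorizer.claims`. The bucket name, size cap, and key layout are made up for illustration. The S3 client is passed in so the logic can be exercised without AWS.

```python
import json
import uuid
from datetime import datetime, timezone

BUCKET = "example-analytics-bucket"  # hypothetical bucket name
MAX_BODY_BYTES = 16 * 1024           # arbitrary cap chosen for this sketch


def respond(status, message):
    """Shape a Lambda proxy integration response."""
    return {"statusCode": status, "body": json.dumps({"message": message})}


def handle_event(event, s3_client, bucket=BUCKET):
    # Assumption: API Gateway has already verified the Cognito token and
    # passed its claims along; "sub" is the user's unique ID.
    request_context = event.get("requestContext") or {}
    claims = (request_context.get("authorizer") or {}).get("claims") or {}
    user_id = claims.get("sub")
    if not user_id:
        return respond(401, "missing identity")

    body = event.get("body") or ""
    if len(body.encode("utf-8")) > MAX_BODY_BYTES:
        return respond(413, "payload too large")
    try:
        payload = json.loads(body)
    except ValueError:
        return respond(400, "body is not valid JSON")
    if not isinstance(payload, dict):
        return respond(400, "expected a JSON object")

    # One object per submission, grouped by date and user.
    now = datetime.now(timezone.utc)
    key = f"events/{now:%Y/%m/%d}/{user_id}/{uuid.uuid4()}.json"
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(payload).encode("utf-8"),
        ContentType="application/json",
    )
    return respond(202, "stored")


def lambda_handler(event, context):
    import boto3  # provided by the AWS Lambda Python runtime

    return handle_event(event, boto3.client("s3"))
```

With this shape the browser never talks to S3 directly; only the function's execution role needs `s3:PutObject` on the analytics bucket.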

AWS
answered 2 years ago
AWS EXPERT kentrad
reviewed 2 years ago
  • This makes sense. You see the same problem with Google Analytics, which is widely used on open frontend sites with no authentication: people report polluted data that has made it into their GA logs. I wasn't sure whether there was some other way to lock down the receiving bucket so that it wouldn't accept data that came from elsewhere, but I suspect it's still too easy to spoof something where there isn't some sort of credentialed user.

  • Yes, FE is very difficult to secure, it turns out! If you found this answer helpful, it would really benefit me if you'd mark it as "Accepted"!

Hey Ben,

I completely get your concern about protecting your analytics bucket from potential data pollution. It's crucial to maintain data integrity in these scenarios. To address this, you can employ a combination of AWS services and IAM policies to tighten the security around your S3 buckets.

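Firstly, ensure your analytics bucket has a strict bucket policy that only allows write access from a specific IAM role. Note that an S3 bucket hosting a static site has no IAM role of its own, and code running in the browser can't hold one safely, so this works only if the writes go through a server-side component. Create a specific IAM role for that component, grant it permission to write to the analytics bucket, and attach the role to the EC2 instance or Lambda function that handles the writes for your web app. This way, only that component can write data to the bucket.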
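The bucket-policy part of this can be sketched as follows. This is an illustration, not a drop-in policy: the bucket name and role ARN are placeholders, and it assumes the role belongs to a server-side component (an EC2 instance or Lambda function), since browser code can't hold an IAM role safely. The statement denies `s3:PutObject` to every principal whose `aws:PrincipalArn` is not the writer role.

```python
import json


def build_write_lockdown_policy(bucket_name, writer_role_arn):
    """Return a bucket policy that denies uploads from everyone but one role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyPutObjectExceptWriterRole",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {
                    # For an assumed-role session this key holds the role ARN.
                    "ArnNotEquals": {"aws:PrincipalArn": writer_role_arn}
                },
            }
        ],
    }


# Placeholder names and account ID, for illustration only.
policy = build_write_lockdown_policy(
    "example-analytics-bucket",
    "arn:aws:iam::123456789012:role/example-analytics-writer",
)
print(json.dumps(policy, indent=2))
```

A Deny statement only blocks other principals; the writer role still needs an Allow for `s3:PutObject` in its own IAM policy.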

Additionally, you can use AWS Lambda to validate incoming data before it is written to the analytics bucket. For example, you could run a Lambda function that checks the incoming data against predefined criteria and filters out any potentially malicious input.
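A validation step along these lines might look like the sketch below. The field names, allowed event types, and limits are invented for illustration; the real criteria depend on what your analytics records contain. One caveat: S3 event notifications invoke Lambda after an object has been created, so a check that has to run before the write fits most naturally in the function that receives the data, while an S3-triggered function could only clean up afterwards.

```python
# Hypothetical schema for one analytics record.
ALLOWED_EVENTS = {"page_view", "click", "form_submit"}
ALLOWED_FIELDS = {"event", "page", "timestamp_ms"}
MAX_FIELD_LENGTH = 256


def validate_record(record):
    """Return a list of problems; an empty list means the record is accepted."""
    if not isinstance(record, dict):
        return ["record must be a JSON object"]

    problems = []
    unexpected = set(record) - ALLOWED_FIELDS
    if unexpected:
        problems.append(f"unexpected fields: {sorted(unexpected)}")

    event = record.get("event")
    if not isinstance(event, str) or event not in ALLOWED_EVENTS:
        problems.append("event must be one of the allowed event names")

    page = record.get("page")
    if (
        not isinstance(page, str)
        or not page.startswith("/")
        or len(page) > MAX_FIELD_LENGTH
    ):
        problems.append("page must be a short site-relative path")

    timestamp = record.get("timestamp_ms")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(timestamp, bool) or not isinstance(timestamp, int) or timestamp <= 0:
        problems.append("timestamp_ms must be a positive integer")

    return problems
```

Checks like these reject malformed input but can't tell a well-formed fake record from a real one, so they complement authentication rather than replace it.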

In my experience, this approach provides robust protection against unauthorized data pollution while allowing your web app to function smoothly. Make sure to monitor your AWS CloudWatch logs for any suspicious activities and regularly review and update your security configurations.

I hope this helps you secure your S3 bucket effectively!

Valer
answered 8 months ago
