How do I protect my Amplify-hosted website from the Bytespider bot?

I have a client's website hosted on Amplify, which has been a great hosting platform for the past few years.

Around the start of February, we noticed a significant increase in requests to the website, which pushed us past our request limits in third-party tools such as Sentry and Weglot. After further investigation, we found that these requests all had the same User-Agent header, "Bytespider", which we've learned is a data scraper for ByteDance. Many forum posts report that it doesn't respect a Disallow rule in robots.txt.

We do have an IP range (and a User Agent, obviously) that we can block, but unfortunately, as the website is on Amplify and not our own AWS setup, we can't access the CloudFront distribution or a WAF to implement rules to block these requests.

Does anyone know a way we can solve this issue?

1 Answer

Hello.

As of March 2024, AWS WAF cannot be attached directly to an Amplify-hosted app. I think you will need to change the setup so that your own CloudFront distribution sits in front of Amplify, and then associate AWS WAF with that distribution.
https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/enable-aws-waf-for-web-applications-hosted-by-aws-amplify.html
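Once a CloudFront distribution with AWS WAF is in place, a rule that blocks on the User-Agent header might look roughly like the sketch below. It only builds the rule as a Python dict in the shape the WAFv2 API expects for a byte match statement; it does not call AWS. The names build_block_rule and would_block, the rule name and the metric name are my own placeholders, not something from the linked guide, and would_block is just a local approximation of the match for quick checks.

```python
import json


def build_block_rule(bot_token: str = "bytespider", priority: int = 0) -> dict:
    """Build a WAFv2 rule dict that blocks requests whose User-Agent contains bot_token.

    The rule name and metric name are illustrative placeholders.
    """
    name = f"block-{bot_token.lower()}"
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            "ByteMatchStatement": {
                # boto3 expects bytes for SearchString.
                "SearchString": bot_token.lower().encode("utf-8"),
                "FieldToMatch": {"SingleHeader": {"Name": "user-agent"}},
                # Lowercase the header first so the match is case-insensitive.
                "TextTransformations": [{"Priority": 0, "Type": "LOWERCASE"}],
                "PositionalConstraint": "CONTAINS",
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }


def would_block(rule: dict, user_agent: str) -> bool:
    """Local approximation of the rule: lowercase the header, then substring match."""
    needle = rule["Statement"]["ByteMatchStatement"]["SearchString"]
    return needle in user_agent.lower().encode("utf-8")


if __name__ == "__main__":
    rule = build_block_rule()
    # Decode bytes only so the rule can be printed as JSON for inspection.
    print(json.dumps(rule, indent=2, default=lambda b: b.decode("utf-8")))
    print(would_block(rule, "Mozilla/5.0 (compatible; Bytespider)"))
```

If you manage the web ACL with boto3, a dict like this would go in the Rules list passed to the wafv2 client's create_web_acl call, with Scope set to "CLOUDFRONT" (web ACLs for CloudFront are created in us-east-1). A rule based on the IP range would follow the same pattern with an IP set instead of a byte match.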

EXPERT
answered 2 months ago
EXPERT
reviewed a month ago
  • Is AWS WAF the only possible solution to this problem?

  • As far as I know, Amplify has no built-in feature for restricting access by IP address or similar criteria, so I think you need AWS WAF or a comparable service in front of it.
