How do I protect my Amplify-hosted website from the Bytespider bot?

I have a client's website hosted on Amplify, which has been a great hosting platform for the past few years.

Around the start of February, we noticed a significant increase in requests to the website, which pushed us over the request limits in third-party tools such as Sentry and Weglot. After further investigation, we found that these requests all carried the same User-Agent header, "Bytespider", which we've learned is a data scraper operated by ByteDance. Many forum threads report that it doesn't respect a Disallow rule in robots.txt.

We do have an IP range (and a User Agent, obviously) that we can block, but unfortunately, as the website is on Amplify and not our own AWS setup, we can't access the CloudFront distribution or a WAF to implement rules to block these requests.

Does anyone know a way we can solve this issue?

1 Answer

Hello.

As of March 2024, AWS WAF cannot be attached directly to an Amplify-hosted app, so I think you will need to change the setup: place your own CloudFront distribution in front of Amplify and attach AWS WAF to that distribution.
https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/enable-aws-waf-for-web-applications-hosted-by-aws-amplify.html
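
Once a WAF web ACL is attached to that CloudFront distribution, a rule matching the User-Agent header could look roughly like the sketch below. It only builds the rule definition as a Python dict in the shape the WAFv2 API expects and does not call AWS. The function name, the rule name `BlockBytespider`, and the metric name are my own placeholders, not anything from the linked pattern.

```python
def bytespider_block_rule(priority: int = 0) -> dict:
    """Build a WAFv2 rule that blocks requests whose User-Agent contains 'bytespider'.

    The rule and metric names are placeholders chosen for this example.
    """
    return {
        "Name": "BlockBytespider",
        "Priority": priority,
        "Statement": {
            "ByteMatchStatement": {
                # Compared after the LOWERCASE transformation below, so keep it lowercase.
                "SearchString": b"bytespider",
                "FieldToMatch": {"SingleHeader": {"Name": "user-agent"}},
                "TextTransformations": [{"Priority": 0, "Type": "LOWERCASE"}],
                "PositionalConstraint": "CONTAINS",
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "BlockBytespider",
        },
    }
```

The returned dict is meant to go into the `Rules` list of a web ACL, for example via the boto3 `wafv2` client's `create_web_acl` call. For CloudFront, the web ACL has to use `Scope="CLOUDFRONT"` and be created in us-east-1.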

EXPERT
answered 2 months ago
EXPERT
reviewed 2 months ago
  • Is AWS WAF the only possible solution to this problem?

  • As far as I know, Amplify has no built-in way to restrict access by IP address or similar, so I think you need something like AWS WAF for that.
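
For the IP-range side of the blocking, a WAF rule can reference an IP set. The following is a minimal sketch, assuming an IP set containing the scraper's ranges has already been created in WAF. Like any rule definition, it is just a dict in the WAFv2 API shape. The function name, rule name, and the ARN in the usage line are hypothetical placeholders.

```python
def ip_set_block_rule(ip_set_arn: str, priority: int = 1) -> dict:
    """Build a WAFv2 rule that blocks requests from addresses in an existing IP set.

    `ip_set_arn` must point at an IP set you created yourself; the rule and
    metric names are placeholders chosen for this example.
    """
    return {
        "Name": "BlockScraperIpRanges",
        "Priority": priority,
        "Statement": {"IPSetReferenceStatement": {"ARN": ip_set_arn}},
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "BlockScraperIpRanges",
        },
    }


# Hypothetical ARN, shown only to illustrate the expected format.
rule = ip_set_block_rule(
    "arn:aws:wafv2:us-east-1:123456789012:global/ipset/scraper-ranges/EXAMPLE-ID"
)
```

Each rule in a web ACL needs a unique priority, so if this is combined with a User-Agent rule, give the two different `priority` values.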
