How do I protect my Amplify-hosted website from the Bytespider bot?


I have a client's website hosted on Amplify, which has been a great hosting platform for the past few years.

Around the start of February, we noticed a significant increase in requests to the website, which pushed us over our request limits in third-party tools such as Sentry and Weglot. After further investigation, we found that these requests all carried the same User-Agent header, "Bytespider", which we've learned is a data scraper operated by ByteDance. Many forum posts report that it doesn't respect a Disallow rule in robots.txt.

We do have an IP range (and, obviously, a User-Agent) that we could block. Unfortunately, because the website is on Amplify rather than our own AWS setup, we can't access the underlying CloudFront distribution or attach a WAF to implement rules that block these requests.

Does anyone know a way we can solve this issue?

Asked 2 months ago · 427 views
1 Answer

Hello.

As of March 2024, it is not possible to attach AWS WAF directly to an Amplify-hosted app. I think you will need to change the setup so that your own CloudFront distribution sits in front of Amplify, and then attach AWS WAF to that distribution.
https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/enable-aws-waf-for-web-applications-hosted-by-aws-amplify.html
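
For reference, once a CloudFront distribution with AWS WAF is in place, a User-Agent block rule could look roughly like the sketch below. It only builds the rule definition as a Python dict in the shape that the WAFv2 API expects for an entry in a web ACL's `Rules` list. The rule name `BlockBytespider`, the metric name, and the helper function are my own placeholders, not anything from the linked guide, so adjust them to your setup.

```python
from pprint import pprint


def build_bytespider_block_rule(priority: int = 0) -> dict:
    """Build a WAFv2 rule definition that blocks requests whose
    User-Agent header contains "bytespider" (case-insensitive).

    The rule and metric names are hypothetical placeholders.
    """
    return {
        "Name": "BlockBytespider",
        "Priority": priority,
        "Statement": {
            "ByteMatchStatement": {
                # boto3 expects bytes here and handles the encoding itself.
                "SearchString": b"bytespider",
                "FieldToMatch": {"SingleHeader": {"Name": "user-agent"}},
                # Lowercase the header first so the match ignores case.
                "TextTransformations": [{"Priority": 0, "Type": "LOWERCASE"}],
                "PositionalConstraint": "CONTAINS",
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "BlockBytespider",
        },
    }


if __name__ == "__main__":
    # Print the rule so it can be inspected before adding it to a web ACL.
    pprint(build_bytespider_block_rule())
```

The resulting dict would go into the `Rules` list of a web ACL. Note that a web ACL used with CloudFront has to be created with the `CLOUDFRONT` scope in the us-east-1 region. An IP-based rule would follow the same pattern with an IP set statement instead of the byte match.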

Answered 2 months ago (Expert)
Reviewed 2 months ago (Expert)
  • Is AWS WAF the only possible solution to this problem?

  • As far as I know, Amplify has no built-in feature for restricting access by IP address or similar criteria, so I think you need AWS WAF or a comparable service to do this.
