All major search engines respect the restrictions declared in your robots.txt file. However, it can take a while for crawlers to discover the file, and it may take weeks or months for content that was crawled previously to be removed. You can use tools like Google's Search Console (https://search.google.com/search-console/about) and Bing's Content Removal Tool (https://www.bing.com/webmasters/help/content-removal-broken-links-or-outdated-cache-cb6c294d) to explicitly request removal of addresses you don't want shown, but even then it may take weeks or months for the content to stop appearing entirely.
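As a starting point, a minimal robots.txt might look like the sketch below. The `/private/` path is purely illustrative; replace it with the paths you actually want to keep out of search results.

```
# Sketch: ask all crawlers to skip an illustrative /private/ path
User-agent: *
Disallow: /private/

# Or, to ask crawlers to skip the entire site:
# User-agent: *
# Disallow: /
```

Note that robots.txt is a request, not an access control: it keeps compliant crawlers out, but it does not remove already-indexed pages on its own, which is why the removal tools above exist.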
Technically, you could use Web Application Firewall (WAFv2) and its Bot Control feature (https://docs.aws.amazon.com/waf/latest/developerguide/aws-managed-rule-groups-bot.html) to identify and filter out requests from known search engines. However, WAF inspection and the Bot Control feature are both charged separately (https://aws.amazon.com/waf/pricing/), and blocking search engines this way will likely be even slower in getting your content removed from the search index: when WAF blocks a crawler, the search engine gets no clear signal whether the site owner wanted the content removed or a technical error is simply preventing it from being re-indexed.
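For completeness, a rule along these lines is roughly what that approach would involve. This is a sketch of a web ACL rule referencing the AWS-managed Bot Control rule group; the rule name, metric name, and the `CategorySearchEngine` override are assumptions you should verify against the current Bot Control rule list before use.

```json
{
  "Name": "block-search-engine-bots",
  "Priority": 0,
  "Statement": {
    "ManagedRuleGroupStatement": {
      "VendorName": "AWS",
      "Name": "AWSManagedRulesBotControlRuleSet",
      "RuleActionOverrides": [
        {
          "Name": "CategorySearchEngine",
          "ActionToUse": { "Block": {} }
        }
      ]
    }
  },
  "OverrideAction": { "None": {} },
  "VisibilityConfig": {
    "SampledRequestsEnabled": true,
    "CloudWatchMetricsEnabled": true,
    "MetricName": "block-search-engine-bots"
  }
}
```

Even if this works as intended, it only stops future crawling; it does not tell the search engine to drop what it has already indexed, so the removal tools above remain the more direct route.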
