CloudFront domain publicly accessible

Our CloudFront domain name shows up in search engine results. Note that CloudFront is configured with a Lambda@Edge function to fetch the images, and it sits behind a publicly hosted Route 53 zone.

We would like to stop it from appearing in search engines. We tried a robots.txt Disallow rule, with no luck. Are there any other access restrictions we could apply?

asked a year ago, 250 views
1 Answer
All major search engines respect the restrictions you declare in your robots.txt file. However, it can take a while for them to discover it, and it may take weeks or months for them to remove the content that was crawled previously. You can use features like Google's Search Console (https://search.google.com/search-console/about) and Bing's Content Removal Tool (https://www.bing.com/webmasters/help/content-removal-broken-links-or-outdated-cache-cb6c294d) to ask them explicitly to remove addresses that you don't want them to show, but it may still take weeks or months for the content to stop showing entirely.
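Before waiting on the search engines, it may be worth confirming that the robots.txt rules actually block what you expect. The following is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt body, the `is_blocked` helper, and the `d111111abcdef8.cloudfront.net` placeholder domain are assumptions for illustration, not details from the question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body that blocks every crawler from the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder distribution domain; substitute your own.
    url = "https://d111111abcdef8.cloudfront.net/images/example.jpg"
    for agent in ("Googlebot", "bingbot"):
        print(agent, "blocked:", is_blocked(ROBOTS_TXT, agent, url))
```

This parser silently ignores directives it does not recognize, so a misspelled `Disallow` line blocks nothing. Real crawlers may interpret edge cases differently from this parser, so treat the result as a sanity check rather than a guarantee.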

Technically, you could use AWS WAF (WAFv2) and its Bot Control feature (https://docs.aws.amazon.com/waf/latest/developerguide/aws-managed-rule-groups-bot.html) to identify and block requests from known search engines. However, WAF inspection and the separately charged Bot Control feature both cost extra (https://aws.amazon.com/waf/pricing/), and blocking search engines this way will likely be even slower at getting your content removed from the search index. When WAF blocks a crawler, the search engine has no clear indication of whether the site owner wants the content removed or a technical error is preventing it from being re-indexed.
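Since the distribution already runs Lambda@Edge, one way to send crawlers an explicit signal instead of a block is to attach an `X-Robots-Tag: noindex` response header. This is a swapped-in alternative to WAF, not something the question describes, and the sketch below assumes a Python Lambda@Edge function attached to the origin-response or viewer-response trigger. The handler name and the header value are illustrative choices.

```python
def lambda_handler(event, context):
    """Add an X-Robots-Tag header to a CloudFront response event."""
    # CloudFront passes the response under Records[0].cf.response.
    response = event["Records"][0]["cf"]["response"]
    headers = response.setdefault("headers", {})

    # Header map keys are lowercase; each value is a list of key/value dicts.
    headers["x-robots-tag"] = [
        {"key": "X-Robots-Tag", "value": "noindex, nofollow"}
    ]
    return response
```

A crawler only sees this header if it is allowed to fetch the URL, so it does not combine well with a robots.txt Disallow rule covering the same paths. You would likely have to pick one approach per path.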

EXPERT
answered a year ago
