We recently found that WAF was blocking Googlebot with its AWSManagedIPReputationList.
Using the IP ranges that Google publishes here:
https://developers.google.com/search/docs/advanced/crawling/verifying-googlebot
https://developers.google.com/search/apis/ipranges/googlebot.json
we implemented an ALLOW rule for those ranges. I'm not sure why AWSManagedIPReputationList picked up the Googlebot IPs without proper investigation; we will need to review our continued use of it if it keeps blocking legitimate IPs.
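For anyone wanting to keep such an allow list current, here is a minimal Python sketch of pulling the ranges out of googlebot.json. It assumes the file is a JSON object with a "prefixes" array whose entries carry either an "ipv4Prefix" or an "ipv6Prefix" key; the function names are made up for illustration, so check the layout against the live file before relying on it.

```python
import ipaddress
import json
from urllib.request import urlopen

GOOGLEBOT_RANGES_URL = "https://developers.google.com/search/apis/ipranges/googlebot.json"


def extract_prefixes(doc):
    """Split the published prefixes into IPv4 and IPv6 CIDR lists.

    Assumes entries look like {"ipv4Prefix": "..."} or {"ipv6Prefix": "..."}.
    """
    ipv4, ipv6 = [], []
    for entry in doc.get("prefixes", []):
        if "ipv4Prefix" in entry:
            # ip_network() raises ValueError on a malformed CIDR
            ipv4.append(str(ipaddress.ip_network(entry["ipv4Prefix"])))
        elif "ipv6Prefix" in entry:
            ipv6.append(str(ipaddress.ip_network(entry["ipv6Prefix"])))
    return ipv4, ipv6


def fetch_googlebot_prefixes(url=GOOGLEBOT_RANGES_URL):
    """Download the published ranges and return (ipv4_cidrs, ipv6_cidrs)."""
    with urlopen(url, timeout=10) as resp:
        return extract_prefixes(json.load(resp))
```

The two lists are kept apart because a WAF IP set holds a single address family, so IPv4 and IPv6 ranges would go into separate IP sets. The allow rule also has to be evaluated before the managed rule that blocks, otherwise it never gets a chance to match.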
Cheers,
Chris
Hi,
I am not sure there is one recipe for every good bot. In Google's case, they explain how to confirm that a request really comes from Googlebot and not a fake one:
1-Run a reverse DNS lookup on the accessing IP address from your logs, using the host command.
2-Verify that the domain name is in either googlebot.com or google.com.
3-Run a forward DNS lookup on the domain name retrieved in step 1, using the host command. Verify that it resolves to the original accessing IP address from your logs.
Example:
host 66.249.66.1
1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.
host crawl-66-249-66-1.googlebot.com
crawl-66-249-66-1.googlebot.com has address 66.249.66.1
You will have to implement that with a Lambda function that checks the logs, I guess.
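As a rough idea of what the check inside such a function could look like, here is a minimal Python sketch of the three steps above. The function names are hypothetical and the resolvers are passed in as parameters so the logic can be tried without network access; it is not a complete Lambda handler and does not read WAF logs.

```python
import ipaddress
import socket

# Domains named in the steps above; the leading dot stops "fakegooglebot.com" from matching
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip):
    """Step 1: PTR lookup, returns the primary host name for the IP."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname):
    """Step 3: forward lookup, returns the set of addresses for the host name."""
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def is_verified_googlebot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Return True only if the IP passes the reverse and forward DNS checks."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
        # Step 2: the name must sit under googlebot.com or google.com
        if not hostname.endswith(GOOGLE_SUFFIXES):
            return False
        # Step 3: the name must resolve back to the original IP
        resolved = {ipaddress.ip_address(addr) for addr in forward(hostname)}
        return ipaddress.ip_address(ip) in resolved
    except (OSError, ValueError):
        # Lookup failures and malformed addresses count as "not verified"
        return False
```

Calling is_verified_googlebot("66.249.66.1") performs real DNS lookups, so results depend on the resolver available where the code runs. Since every check costs two lookups, caching the verdict per IP is probably worthwhile.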
Regards