How do I allow requests from a bot that's blocked by an AWS WAF Bot Control rule group?

I want to allow requests from a legitimate bot that's blocked by an AWS WAF Bot Control rule group.

Resolution

To allow requests from a legitimate bot blocked by a Bot Control rule group, complete the following steps.

Note: The AWS WAF Bot Control rule group uses IP addresses from AWS WAF to verify bots. Some verified bots route through a proxy or a CDN that doesn't preserve the client IP address when it forwards requests. If your bot routes this way, then you must specifically allow that bot.

Use your AWS WAF logs to identify the Bot Control rule that blocks the requests

To identify the rule that blocks the legitimate bot request, analyze your AWS WAF logs. Use an Amazon Athena query or Amazon CloudWatch Logs Insights.

Use an Athena query to analyze AWS WAF logs

Complete the following steps:

  1. Use partition projection to create a table for AWS WAF logs in Athena.
  2. To find the blocked request details, run the following Athena query:
    WITH waf_data AS (
      SELECT from_unixtime(waf.timestamp / 1000) AS time,
             waf.terminatingRuleId,
             waf.action,
             waf.httprequest.clientip AS clientip,
             waf.httprequest.requestid AS requestid,
             waf.httprequest.country AS country,
             rulegroup.terminatingrule.ruleid AS matchedRule,
             labels AS Labels,
             map_agg(LOWER(f.name), f.value) AS kv
      FROM waf_logs waf,
           UNNEST(waf.httprequest.headers) AS t(f),
           UNNEST(waf.rulegrouplist) AS t(rulegroup)
      WHERE rulegroup.terminatingrule.ruleid IS NOT NULL
      GROUP BY 1, 2, 3, 4, 5, 6, 7, 8
    )
    SELECT waf_data.time,
           waf_data.action,
           waf_data.terminatingRuleId,
           waf_data.matchedRule,
           waf_data.kv['user-agent'] AS UserAgent,
           waf_data.kv['user-agent'] LIKE 'pingdom%',
           waf_data.clientip,
           waf_data.country,
           waf_data.Labels
    FROM waf_data
    WHERE terminatingRuleId = 'AWS-AWSManagedRulesBotControlRuleSet'
      AND time > now() - interval '3' day
    ORDER BY time DESC
    Note: Replace waf_logs with your table name, time > now() - interval '3' day with your time range, and pingdom% with your bot name. The name waf_data is the common table expression that the query defines, so you don't need to change it.
    For example Athena queries that filter records for a specified time range, see Example queries for AWS WAF logs.
  3. Check the matchedRule column to identify the rule that blocks legitimate bot requests. The following is the example output of the Athena query from step 2:
    #: 1
    time: 2024-04-10 15:11:18.000
    action: BLOCK
    terminatingRuleId: AWS-AWSManagedRulesBotControlRuleSet
    matchedRule: CategoryMonitoring
    UserAgent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/61.0.3163.100 Chrome/61.0.3163.100 Safari/537.36 PingdomPageSpeed/1.0 (pingbot/2.0; +http://www.pingdom.com/)
    _col5: false
    clientip: 192.0.2.0
    country: US
    labels: [{name=awswaf:managed:token:absent}, {name=awswaf:managed:aws:bot-control:bot:name:pingdom}, {name=awswaf:managed:aws:bot-control:bot:unverified}, {name=awswaf:managed:aws:bot-control:bot:category:monitoring},{name=awswaf:managed:captcha:absent}, {name=awswaf:managed:aws:bot-control:signal:non_browser_user_agent}]
    Note: The Athena query output appears as a table.

Use a CloudWatch Logs Insights query

Complete the following steps:

  1. Create a CloudWatch Logs log group.

  2. Open the AWS WAF console, and then choose the AWS Region where you created your web ACL.

  3. Choose Web ACLs, and then select your web ACL.

  4. Choose CloudWatch Log Insights, and then paste the following query into the Query Editor:

    fields @timestamp, @message
    | parse @message '{"name":"User-Agent","value":"*"}' as userAgent
    | filter @message like 'awswaf:managed:aws:bot-control'
    | parse @message '"labels":[*]' as Labels
    | parse @message '"httpRequest":{"clientIp":"*","country":"*"' as IP, Country
    | parse @message '"terminatingRule":{"ruleId":"*","action":"*"' as ruleID, action
    | filter action = "BLOCK"
    | display @timestamp, userAgent, IP, Country, ruleID, action, terminatingRuleId, Labels
    | sort @timestamp desc
    | limit 100
  5. (Optional) To find a bot from a specific IP address, add the following filter to the preceding query:

    | filter IP = '192.0.2.0'

    Note: Replace 192.0.2.0 with your IP address. This filter is an exact string match on a single address, so it doesn't match a CIDR range.

  6. (Optional) To find a bot with a specific user agent value, add the following filter to the preceding query:

    | filter userAgent like 'user-agent value'

    Note: Replace user-agent value with your user agent value.

  7. To see the whole log message, expand the log entries.

  8. To find the rule that blocks legitimate bot requests, search for the terminatingRuleId and ruleGroupList.x.terminatingRule.ruleId fields. The labels in the log entry show why the request was blocked. In the following example, ruleGroupList.0.terminatingRule.ruleId shows that the CategoryMonitoring rule blocked the request:

    httpSourceName                             ALB
    labels.0.name                              awswaf:managed:token:absent
    labels.1.name                              awswaf:managed:aws:bot-control:bot:name:pingdom
    labels.2.name                              awswaf:managed:aws:bot-control:bot:unverified
    labels.3.name                              awswaf:managed:aws:bot-control:bot:category:monitoring
    labels.4.name                              awswaf:managed:captcha:absent
    ruleGroupList.0.customerConfig.0.name      InspectionLevel
    ruleGroupList.0.customerConfig.0.value     COMMON
    ruleGroupList.0.customerConfig.1.name      EnableMachineLearning
    ruleGroupList.0.customerConfig.1.value     null
    ruleGroupList.0.ruleGroupId                AWS#AWSManagedRulesBotControlRuleSet
    ruleGroupList.0.terminatingRule.action     BLOCK
    ruleGroupList.0.terminatingRule.ruleId     CategoryMonitoring
    terminatingRuleId                          AWS-AWSManagedRulesBotControlRuleSet
    terminatingRuleType                        MANAGED_RULE_GROUP

    A request can have multiple labels attached. For the purpose of these instructions, only the requests related to block rules are relevant.
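
The fields from step 8 can also be read from a raw log record in code. The following Python sketch assumes that you already have one AWS WAF log record as a JSON string, for example copied from an expanded log entry. The function name find_blocking_rule and the trimmed sample record are illustrative and aren't part of any AWS SDK.

```python
import json


def find_blocking_rule(raw_record):
    """Return (rule_group_id, rule_id, label_names) for the blocking rule, or None."""
    record = json.loads(raw_record)
    label_names = [label["name"] for label in record.get("labels") or []]
    for group in record.get("ruleGroupList") or []:
        terminating = group.get("terminatingRule")
        # terminatingRule is empty for rule groups that didn't end the evaluation.
        if terminating and terminating.get("action") == "BLOCK":
            return group.get("ruleGroupId"), terminating.get("ruleId"), label_names
    return None


# Trimmed sample that mirrors the example log entry above.
sample_record = json.dumps({
    "terminatingRuleId": "AWS-AWSManagedRulesBotControlRuleSet",
    "terminatingRuleType": "MANAGED_RULE_GROUP",
    "action": "BLOCK",
    "ruleGroupList": [{
        "ruleGroupId": "AWS#AWSManagedRulesBotControlRuleSet",
        "terminatingRule": {"ruleId": "CategoryMonitoring", "action": "BLOCK"},
    }],
    "labels": [
        {"name": "awswaf:managed:aws:bot-control:bot:name:pingdom"},
        {"name": "awswaf:managed:aws:bot-control:bot:category:monitoring"},
    ],
})

print(find_blocking_rule(sample_record))
```

The returned rule ID (CategoryMonitoring in the sample) is the rule to set to count, and the label list contains the bot name label to use in the custom rule.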

Set the Bot Control rule that blocks the requests to count

Complete the following steps:

  1. Open the AWS WAF console, and then choose the Region where you created your web ACL.
  2. Choose Web ACLs, and then select your web ACL.
  3. Under Rules, select the AWSManagedRulesBotControlRuleSet rule group, and then choose Edit.
  4. Select the rule that blocks legitimate bot requests, and then set it to Override to Count.
  5. Choose Save. Now, AWS WAF verifies and attaches labels to requests that match the rule, but doesn't block them.
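
If you manage the web ACL as JSON instead of through the console, the same override can be written in the managed rule group statement. The following Python sketch only builds that statement as a dictionary and doesn't call AWS. It assumes the AWS WAFV2 RuleActionOverrides field, the helper name count_override_statement is hypothetical, and it omits the Bot Control inspection level configuration that a real rule group statement also carries.

```python
import json


def count_override_statement(rule_names):
    """Build a Bot Control rule group statement that overrides the given rules to Count."""
    return {
        "ManagedRuleGroupStatement": {
            "VendorName": "AWS",
            "Name": "AWSManagedRulesBotControlRuleSet",
            # One override entry per rule that blocks the legitimate bot.
            "RuleActionOverrides": [
                {"Name": rule_name, "ActionToUse": {"Count": {}}}
                for rule_name in rule_names
            ],
        }
    }


statement = count_override_statement(["CategoryMonitoring"])
print(json.dumps(statement, indent=2))
```

Rules that aren't listed in the overrides keep the action that the rule group defines for them.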

Create a custom rule to block all matching requests except for the bot that you want to allow

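To block illegitimate requests, create a rule that matches on labels. The rule blocks requests that have the label of the Bot Control rule that you set to count, except for requests that also have the label of the bot that you want to allow. For example: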

Name: name of the rule
 IF (Statement 1): the request contains the label (enter the label of the rule that you set to count)
   AND
 NOT (Statement 2): the request contains the label (enter the bot label you want to allow)
 THEN Action: Block

Note: If the rule that blocks adds an awswaf:managed:aws:bot-control:bot:category label, then Bot Control also attaches a label with the bot name, in the form awswaf:managed:aws:bot-control:bot:name:bot-name. You can use this bot name label to identify legitimate bots. Otherwise, use the user-agent header to identify legitimate bots.
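
To make the AND/NOT logic concrete, the following Python sketch applies the same decision to a list of label names. It only illustrates the logic and isn't how AWS WAF evaluates rules internally. The function name should_block and the example_bot label are hypothetical.

```python
def should_block(labels, category_label, allowed_bot_label):
    """Block when the category label is present and the allowed bot label is absent."""
    present = set(labels)
    return category_label in present and allowed_bot_label not in present


CATEGORY = "awswaf:managed:aws:bot-control:bot:category:monitoring"
ALLOWED_BOT = "awswaf:managed:aws:bot-control:bot:name:pingdom"
# Hypothetical label for a different monitoring bot, used only for illustration.
OTHER_BOT = "awswaf:managed:aws:bot-control:bot:name:example_bot"

print(should_block([CATEGORY, ALLOWED_BOT], CATEGORY, ALLOWED_BOT))  # False: allowed bot
print(should_block([CATEGORY, OTHER_BOT], CATEGORY, ALLOWED_BOT))    # True: other bot in the category
print(should_block([], CATEGORY, ALLOWED_BOT))                       # False: no category label
```

The third case shows that the custom rule leaves requests without the category label untouched, so other rules in the web ACL still decide what happens to them.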

To create a custom rule, complete the following steps:

  1. Open the AWS WAF console.
  2. Choose Web ACLs, and then select your web ACL.
  3. Under Rules, choose Add rules, and then choose Add my own rules and rule groups.
  4. Configure your rule with the following settings:
    Rule type to Rule Builder.
    Rule name to your rule name.
    Type to Regular rule.
  5. To create your rule parameters, set If a request to matches all the statements (AND). Then, create the following statements:
    Configure your first statement with the following settings:
    Inspect to Has a label.
    Match scope to Label.
    Match key to awswaf:managed:aws:bot-control:bot:category:category-name.
    Note: Replace category-name with your rule category name.
    Configure your second statement with the following settings:
    Under Results, turn on Negate statement.
    Inspect to Has a label.
    Match scope to Label.
    Match key to awswaf:managed:aws:bot-control:bot:name:bot-name.
    Note: Replace bot-name with your bot name.
  6. For Action, choose Block.
  7. Choose Add rule.
  8. For Set rule priority, move the custom rule below the Bot Control rule group so that AWS WAF evaluates the custom rule after Bot Control adds its labels.
  9. Choose Save.

The following example is a custom rule that uses labels to allow the legitimate bot and block all other bots in the same category:

{
  "Name": "exception-rule",
  "Priority": 2,
  "Statement": {
    "AndStatement": {
      "Statements": [
        {
          "LabelMatchStatement": {
            "Scope": "LABEL",
            "Key": "awswaf:managed:aws:bot-control:bot:category:monitoring"
          }
        },
        {
          "NotStatement": {
            "Statement": {
              "LabelMatchStatement": {
                "Scope": "LABEL",
                "Key": "awswaf:managed:aws:bot-control:bot:name:pingdom"
              }
            }
          }
        }
      ]
    }
  },
  "Action": {
    "Block": {}
  },
  "VisibilityConfig": {
    "SampledRequestsEnabled": true,
    "CloudWatchMetricsEnabled": true,
    "MetricName": "exception-rule"
  }
}
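
If you need the same rule for several categories or bots, you can generate the JSON instead of editing it by hand. The following Python sketch produces a rule with the same shape as the preceding example. The helper name build_exception_rule and the rule name allow-pingdom-exception are hypothetical, and the sketch only builds the document without sending it to AWS.

```python
import json

LABEL_NAMESPACE = "awswaf:managed:aws:bot-control:bot"


def build_exception_rule(rule_name, priority, category_name, bot_name):
    """Build a rule that blocks a Bot Control category except for one named bot."""

    def label_match(key):
        return {"LabelMatchStatement": {"Scope": "LABEL", "Key": key}}

    return {
        "Name": rule_name,
        "Priority": priority,
        "Statement": {
            "AndStatement": {
                "Statements": [
                    # Statement 1: the request has the category label.
                    label_match(f"{LABEL_NAMESPACE}:category:{category_name}"),
                    # Statement 2 (negated): the request has the allowed bot's label.
                    {"NotStatement": {
                        "Statement": label_match(f"{LABEL_NAMESPACE}:name:{bot_name}")
                    }},
                ]
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": rule_name,
        },
    }


rule = build_exception_rule("allow-pingdom-exception", 2, "monitoring", "pingdom")
print(json.dumps(rule, indent=2))
```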

Validate that AWS WAF allows legitimate bot traffic

To verify that AWS WAF allows the legitimate bot traffic, check the AWS WAF logs again. If the bot is still blocked, then repeat the preceding process for each additional rule that blocks it, because more than one rule can block legitimate traffic.
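
One way to run this check over exported logs is to count the records that are still blocked for the bot and group them by rule. The following Python sketch assumes that the log records are already parsed into dictionaries. The function name rules_still_blocking and the sample records are illustrative, and the SignalNonBrowserUserAgent entry stands in for a second rule that blocks the bot.

```python
from collections import Counter


def rules_still_blocking(records, bot_label):
    """Count BLOCK records that carry bot_label, grouped by the rule that blocked them."""
    counts = Counter()
    for record in records:
        label_names = {label["name"] for label in record.get("labels") or []}
        if record.get("action") != "BLOCK" or bot_label not in label_names:
            continue
        # Prefer the rule inside a rule group; fall back to the web ACL rule ID.
        rule_id = record.get("terminatingRuleId")
        for group in record.get("ruleGroupList") or []:
            terminating = group.get("terminatingRule")
            if terminating:
                rule_id = terminating.get("ruleId", rule_id)
                break
        counts[rule_id] += 1
    return counts


BOT_LABEL = "awswaf:managed:aws:bot-control:bot:name:pingdom"
sample_records = [
    # Allowed after the CategoryMonitoring override.
    {"action": "ALLOW", "labels": [{"name": BOT_LABEL}], "ruleGroupList": []},
    # Still blocked by a different Bot Control rule.
    {
        "action": "BLOCK",
        "terminatingRuleId": "AWS-AWSManagedRulesBotControlRuleSet",
        "labels": [{"name": BOT_LABEL}],
        "ruleGroupList": [
            {"terminatingRule": {"ruleId": "SignalNonBrowserUserAgent", "action": "BLOCK"}}
        ],
    },
]

print(rules_still_blocking(sample_records, BOT_LABEL))
```

An empty result means that none of the checked records show the bot as blocked. Any rule ID that remains is the next rule to repeat the process for.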

Related information

Bot Control example: Allow a specific blocked bot

Bot Control example: Create an exception for a blocked user agent

Log fields

Managing rule group behavior in a web ACL

AWS OFFICIAL
Updated 7 months ago