Implementing Cross-Region Inference with Amazon Bedrock while Maintaining Your Landing Zone Structure

10 minute read
Content level: Expert

Learn how to implement Amazon Bedrock cross-region inference while maintaining your AWS Landing Zone structure

Authors: Arlind Nocaj (arlnocaj@amazon.ch) and Markus Rollwagen (rollwag@amazon.ch)

As organizations increasingly adopt AI capabilities within their cloud infrastructure, managing cross-region inference while maintaining landing zone governance controls presents a unique challenge. This is particularly crucial for enterprises operating under specific data residency requirements or those with established regional operational boundaries. In this post, we'll demonstrate a practical approach that enables organizations to leverage Amazon Bedrock's foundation models while minimizing impact on your landing zone governance and operational control.

This guidance addresses several critical needs:

  • Preserving existing governance controls while expanding AI capabilities
  • Managing traffic spikes through multi-region inference distribution
  • Centralizing CloudTrail logs and management in the existing source region
  • Implementing precise access controls through IAM policies

This approach is especially beneficial for organizations with specific regional requirements, such as EU-based companies that must keep their AI operations within European regions while still benefiting from cross-region inference. Suppose all of your workloads run in two specific regions, for example us-east-1 and eu-central-1, but you want to increase throughput and performance by enabling cross-region inference on AWS.

Through careful configuration of Service Control Policies (SCPs) and IAM roles, we'll show you how to create secure pathways for Amazon Bedrock's cross-region inference while preserving your regional AWS Organization's governance strategy. For general information about how cross-region inference works, see: Getting Started with Cross-region inference on Amazon Bedrock.

Approach Overview

Let's explore how to implement this solution within your existing AWS Landing Zone in four steps:

  1. Review the existing Landing Zone
  2. Solution: Extending the Landing Zone
    1. Ensure Role Permissions
    2. Ensure Model Access in the source region
    3. Extend your existing Service Control Policies (SCPs) to enable cross-region usage
  3. Run Inference
  4. Observe the CloudTrail events

1. Review Existing Landing Zone

This guide assumes you already have a Landing Zone on AWS, implemented through AWS Control Tower or AWS Organizations. As part of such a landing zone configuration, you might have a Service Control Policy (SCP) that restricts your organization to specific AWS regions.

For example, your organization might use an SCP like the one from the AWS Organizations documentation, which allows access only to the US East (N. Virginia) and Europe (Frankfurt) regions for non-global services.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyAllOutsideEU",
      "Effect": "Deny",
      "NotAction": [
        "cloudfront:*",       
        "route53:*",
        ...
        "trustedadvisor:*",
        "waf-regional:*",
        "waf:*",
        "wafv2:*",
        "wellarchitected:*"
      ],
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:RequestedRegion": [
            "eu-central-1",
            "us-east-1"
          ]
        },
        "ArnNotLike": {
          "aws:PrincipalARN": [
            "arn:aws:iam::*:role/Role1AllowedToBypassThisSCP",
            "arn:aws:iam::*:role/Role2AllowedToBypassThisSCP"
          ]
        }
      }
    }
  ]
}

Let’s explore how to use cross-region inference for eu.amazon.nova-pro-v1:0, where the prefix “eu” indicates that inference can use a fixed set of EU regions. At the time of writing, the Nova models are available through cross-region inference, as shown in the AWS Management Console.

Nova models available for cross-region inference and model access enabled

We can try to run an inference request using the Amazon Bedrock Converse API:

aws bedrock-runtime converse \
--model-id eu.amazon.nova-pro-v1:0 \
--messages '[{"role": "user", "content": [{"text": "Describe the purpose of a \"hello world\" program in one line."}]}]' 

We see that the existing governance, enforced by the SCP, correctly denies this type of request with the following error message:

An error occurred (AccessDeniedException) when calling the Converse operation: User: arn:aws:sts::***:assumed-role/YourAssumedRole/username is not authorized to perform: bedrock:InvokeModel on resource: arn:aws:bedrock:eu-west-3::foundation-model/amazon.nova-pro-v1:0 with an explicit deny in a service control policy

2. Solution: Extending the Landing Zone

To enable cross-region inference from a source region such as eu-central-1 (Frankfurt) without having to set up and manage governance for an additional region, we perform the following steps:

a) Ensure your IAM Role has the permissions to run cross-region inference

For this solution, we use eu-central-1 (Frankfurt) as the source region to use the EU region group. The supported cross-region inference profiles page lists the supported models and region groups. The figure below provides a high-level overview at the time of writing.

Overview of Amazon Bedrock regions and Cross-Region inference profiles

You can either use the AWS managed policy AmazonBedrockFullAccess or create custom permissions for your role as described in prerequisites for inference profiles to follow the least-privilege best practice.

Example 1: This policy allows a role to invoke the Amazon Nova Pro v1 model only through the eu.amazon.nova-pro-v1:0 inference profile in AWS account 111122223333 in the Europe (Frankfurt) Region (eu-central-1):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": [
                "arn:aws:bedrock:eu-central-1:111122223333:inference-profile/eu.amazon.nova-pro-v1:0"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": [
                "arn:aws:bedrock:eu-north-1::foundation-model/amazon.nova-pro-v1:0",
                "arn:aws:bedrock:eu-west-1::foundation-model/amazon.nova-pro-v1:0",  
                "arn:aws:bedrock:eu-west-3::foundation-model/amazon.nova-pro-v1:0",  
                "arn:aws:bedrock:eu-central-1::foundation-model/amazon.nova-pro-v1:0"
            ],
            "Condition": {
                "StringLike": {
                    "bedrock:InferenceProfileArn": "arn:aws:bedrock:eu-central-1:111122223333:inference-profile/eu.amazon.nova-pro-v1:0"
                }
            }
        }
    ]
}

Example 2: The following policy allows a role to invoke all enabled models in the source region that support cross-region inference.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": [
                "arn:aws:bedrock:eu-central-1:111122223333:inference-profile/eu.*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": [
                "arn:aws:bedrock:eu-north-1::foundation-model/*",
                "arn:aws:bedrock:eu-west-1::foundation-model/*",  
                "arn:aws:bedrock:eu-west-3::foundation-model/*",  
                "arn:aws:bedrock:eu-central-1::foundation-model/*"
            ],
            "Condition": {
                "StringLike": {
                    "bedrock:InferenceProfileArn": "arn:aws:bedrock:eu-central-1:111122223333:inference-profile/eu.*"
                }
            }
        }
    ]
}
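If you maintain several such policies, generating them from a small helper can reduce copy-and-paste errors. The following is a minimal sketch, not an official tool: the function name `build_cri_policy` is hypothetical, the destination region list is passed in by hand (check the supported inference profiles page for the current list), and it assumes that the foundation model ID is the inference profile ID without its geography prefix, as in Example 1.

```python
import json


def build_cri_policy(account_id, source_region, profile_id, destination_regions):
    """Build an IAM policy document like Example 1 for one inference profile."""
    # Assumption: "eu.amazon.nova-pro-v1:0" maps to model "amazon.nova-pro-v1:0".
    model_id = profile_id.split(".", 1)[1]
    profile_arn = (
        f"arn:aws:bedrock:{source_region}:{account_id}:inference-profile/{profile_id}"
    )
    actions = ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Statement 1: allow calling the inference profile in the source region.
            {"Effect": "Allow", "Action": actions, "Resource": [profile_arn]},
            # Statement 2: allow the underlying model in each destination region,
            # but only when the request arrives through that inference profile.
            {
                "Effect": "Allow",
                "Action": actions,
                "Resource": [
                    f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
                    for region in destination_regions
                ],
                "Condition": {
                    "StringLike": {"bedrock:InferenceProfileArn": profile_arn}
                },
            },
        ],
    }


policy = build_cri_policy(
    "111122223333",
    "eu-central-1",
    "eu.amazon.nova-pro-v1:0",
    ["eu-north-1", "eu-west-1", "eu-west-3", "eu-central-1"],
)
print(json.dumps(policy, indent=4))
```

Called with the values above, the helper reproduces the structure of Example 1; review the generated document before attaching it to a role.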

To use Amazon Bedrock Data Automation, you must extend your IAM roles with cross-region support similar to the example shown above. See Cross-region support required for Bedrock Data Automation for details.

b) Ensure model access is enabled in the source region

Make sure you have enabled the models in your source region; in our example, this is eu-central-1 (Frankfurt). You can check in the AWS console whether the model you want to use is already enabled.

Nova models available for cross-region inference and model access enabled

c) Extend your existing SCPs to enable cross-region inference usage

Finally, to enable cross-region inference on Amazon Bedrock, you need to extend your Service Control Policies (SCPs). This allows you to use cross-region inference without adding governance overhead for new Regions. With this approach, your CloudTrail logs and governance controls remain within your source region (for example, eu-central-1), even when cross-region inference is used.

The following sample policy allows the usage of all models for EU Regions through inference profiles and cross-region inference:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyAllOutsideEU",
      "Effect": "Deny",
      "NotAction": [
        "cloudfront:*",       
        "route53:*",
        ...
        "trustedadvisor:*",
        "waf-regional:*",
        "waf:*",
        "wafv2:*",
        "wellarchitected:*"
      ],
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:RequestedRegion": [
            "eu-central-1",
            "us-east-1"
          ]
        },
        "ArnNotLike": {
          "aws:PrincipalARN": [
            "arn:aws:iam::*:role/Role1AllowedToBypassThisSCP",
            "arn:aws:iam::*:role/Role2AllowedToBypassThisSCP"
          ],
          "bedrock:InferenceProfileArn": [
              "arn:aws:bedrock:eu-central-1:*:inference-profile/eu.*"
          ]
        }
      }
    }
  ]
}
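To build intuition for why this extension opens only the inference-profile path, the Deny condition can be modelled in a few lines. This is a simplified sketch, not the real IAM evaluation engine: the function names are hypothetical, the `NotAction` list is ignored (Bedrock is not among the exempted services), and ARN wildcards are approximated with `fnmatch`. It relies on the documented behaviour that all condition operators in a statement must match for it to apply, and that a negated operator such as `ArnNotLike` evaluates to true when the context key is absent from the request.

```python
from fnmatch import fnmatchcase

ALLOWED_REGIONS = {"eu-central-1", "us-east-1"}
BYPASS_ROLES = ["arn:aws:iam::*:role/Role1AllowedToBypassThisSCP"]
ALLOWED_PROFILES = ["arn:aws:bedrock:eu-central-1:*:inference-profile/eu.*"]


def not_like(value, patterns):
    """Negated match: true if the key is absent or matches none of the patterns."""
    return value is None or not any(fnmatchcase(value, p) for p in patterns)


def scp_denies(region, principal_arn, inference_profile_arn=None):
    """Return True if the Deny statement applies, i.e. every condition holds."""
    return (
        region not in ALLOWED_REGIONS
        and not_like(principal_arn, BYPASS_ROLES)
        and not_like(inference_profile_arn, ALLOWED_PROFILES)
    )


role = "arn:aws:iam::111122223333:role/YourAssumedRole"
profile = (
    "arn:aws:bedrock:eu-central-1:111122223333:"
    "inference-profile/eu.amazon.nova-pro-v1:0"
)

# Direct call to a non-governed region: still denied.
print(scp_denies("eu-west-3", role))           # True
# Same region reached through the EU inference profile: not denied by this SCP.
print(scp_denies("eu-west-3", role, profile))  # False
# Call in a governed region: the region condition does not match.
print(scp_denies("eu-central-1", role))        # False
```

The first case is the important one: because a direct call carries no `bedrock:InferenceProfileArn`, in-region inference in the non-governed regions remains blocked.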

The SCP takes effect on all accounts under the OU that you attach it to. You can also restrict the SCP extension to a specific model, for example by using:

"arn:aws:bedrock:eu-central-1:111122223333:inference-profile/eu.amazon.nova-pro-v1:0"

Cross-region inference requests are kept within the regions that are part of the inference profile that was used. For example, a request made with an EU inference profile is kept within EU regions. A specific inference profile version (in this case, version :0) uses a fixed list of EU regions. This list can change only through future inference profile versions.

There is no additional routing cost for using cross-region inference. The price is calculated based on the region from which you call an inference profile. For information about pricing, see Amazon Bedrock pricing.

When using cross-region inference, your throughput can reach up to double the default quotas in the region that the inference profile is in. The increase in throughput applies only to invocations performed through inference profiles; the regular quota still applies to in-region model invocation requests. For example, if you invoke the US Anthropic Claude 3 Sonnet inference profile in us-east-1, your throughput can reach up to 1,000 requests per minute and 2,000,000 tokens per minute. To see the default quotas for on-demand throughput, refer to the Runtime quotas section in Quotas for Amazon Bedrock or use the Service Quotas console.

3. Running cross-region inference

Now the inference runs successfully and returns the following result:

~ $ aws bedrock-runtime converse --model-id eu.amazon.nova-pro-v1:0 --messages '[{"role": "user", "content": [{"text": "Describe the purpose of a \"hello world\" program in one line."}]}]'
{
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {
                    "text": "To demonstrate the basic syntax and structure of a programming language."
                }
            ]
        }
    },
    "stopReason": "end_turn",
    "usage": {
        "inputTokens": 14,
        "outputTokens": 12,
        "totalTokens": 26
    },
    "metrics": {
        "latencyMs": 442
    }
}
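If a script rather than a person consumes this output, the fields of interest can be pulled out of the JSON directly. The sketch below is illustrative only: the function name `summarize_converse_response` is hypothetical, and it assumes the response has the shape shown above, with text-only content blocks.

```python
import json


def summarize_converse_response(raw_json):
    """Extract the reply text and basic usage figures from Converse output."""
    response = json.loads(raw_json)
    blocks = response["output"]["message"]["content"]
    # Join all text blocks; non-text blocks are skipped.
    text = "".join(block["text"] for block in blocks if "text" in block)
    return {
        "text": text,
        "stop_reason": response["stopReason"],
        "total_tokens": response["usage"]["totalTokens"],
        "latency_ms": response["metrics"]["latencyMs"],
    }


# Sample copied from the CLI output shown above.
sample = """
{
    "output": {"message": {"role": "assistant", "content": [
        {"text": "To demonstrate the basic syntax and structure of a programming language."}
    ]}},
    "stopReason": "end_turn",
    "usage": {"inputTokens": 14, "outputTokens": 12, "totalTokens": 26},
    "metrics": {"latencyMs": 442}
}
"""

summary = summarize_converse_response(sample)
print(summary["text"])
print(summary["total_tokens"], "tokens,", summary["latency_ms"], "ms")
```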

4. Observe the CloudTrail log events for the cross-region inference

When you run a cross-region inference request with Amazon Bedrock, AWS generates CloudTrail logs. Below is the CloudTrail entry for this cross-region inference request. CloudTrail logs are recorded in the region and account where the request originates (the source region), even if the destination region differs from the source region. Verify in your CloudTrail console that your cross-region inference request was logged.

{
    "eventVersion": "1.11",
    "userIdentity": {
        "type": "AssumedRole",
        "principalId": "AROASIVGK32XK12345678:demo_user@mail.com",
        "arn": "arn:aws:sts::111122223333:assumed-role/YourAssumedRole/username",
        "accountId": "111122223333",
        "accessKeyId": "REDACTED",
        "sessionContext": {
            "sessionIssuer": {
                "type": "Role",
                "principalId": "REDACTED",
                "arn": "arn:aws:iam::111122223333:role/aws-reserved/sso.amazonaws.com/eu-central-1/YourAssumedRole",
                "accountId": "111122223333",
                "userName": "YourAssumedRole"
            },
            "attributes": {
                "creationDate": "2025-03-03T08:10:59Z",
                "mfaAuthenticated": "false"
            }
        }
    },
    "eventTime": "2025-03-03T09:01:24Z",
    "eventSource": "bedrock.amazonaws.com",
    "eventName": "Converse",
    "awsRegion": "eu-central-1",
    "sourceIPAddress": "63.176.150.221",
    "userAgent": "aws-cli/2.23.13 md/awscrt#0.23.8 ua/2.0 os/linux#6.1.127-135.201.amzn2023.x86_64 md/arch#x86_64 lang/python#3.12.6 md/pyimpl#CPython exec-env/CloudShell cfg/retry-mode#standard md/installer#exe md/distrib#amzn.2023 md/prompt#off md/command#bedrock-runtime.converse",
    "requestParameters": {
        "modelId": "eu.amazon.nova-pro-v1:0"
    },
    "responseElements": null,
    "requestID": "252ac8af-158b-4be3-82c5-b4d79942c745",
    "eventID": "dc144679-96ce-435c-bf01-b7f5ec96eca9",
    "readOnly": true,
    "eventType": "AwsApiCall",
    "managementEvent": true,
    "recipientAccountId": "111122223333",
    "eventCategory": "Management",
    "tlsDetails": {
        "tlsVersion": "TLSv1.3",
        "cipherSuite": "TLS_AES_128_GCM_SHA256",
        "clientProvidedHostHeader": "bedrock-runtime.eu-central-1.amazonaws.com"
    },
    "sessionCredentialFromConsole": "true"
}

If the inference region differs from the source region, the CloudTrail event contains the following additional key:

# contained in CloudTrail events only if inference runs outside of the source region
      "additionalEventData": {
        "inferenceRegion": "eu-west-3" 
    },
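To see how often requests are actually served outside the source region, exported CloudTrail events can be tallied by inference region. The following is a rough sketch under stated assumptions: the helper names are hypothetical, the events are already loaded as Python dictionaries (for example from an exported log file), and a missing `additionalEventData.inferenceRegion` is taken to mean that the request was served in the source region, as described above.

```python
from collections import Counter


def inference_region(event):
    """Return the region that served the request, falling back to awsRegion."""
    extra = event.get("additionalEventData") or {}
    return extra.get("inferenceRegion", event["awsRegion"])


def count_by_inference_region(events):
    """Count Bedrock events per serving region."""
    return Counter(
        inference_region(event)
        for event in events
        if event.get("eventSource") == "bedrock.amazonaws.com"
    )


# Trimmed-down sample events; real events carry many more fields.
events = [
    {
        "eventSource": "bedrock.amazonaws.com",
        "eventName": "Converse",
        "awsRegion": "eu-central-1",
    },
    {
        "eventSource": "bedrock.amazonaws.com",
        "eventName": "Converse",
        "awsRegion": "eu-central-1",
        "additionalEventData": {"inferenceRegion": "eu-west-3"},
    },
    {
        "eventSource": "s3.amazonaws.com",
        "eventName": "ListBuckets",
        "awsRegion": "eu-central-1",
    },
]

for region, count in sorted(count_by_inference_region(events).items()):
    print(region, count)
```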

If you want to create a CloudWatch metric to track the usage of cross-region inference, you can follow the cross-region sample notebook.

Conclusion

We showed how to effectively extend your existing AWS Landing Zone to leverage Amazon Bedrock's cross-region inference capabilities without introducing additional governance overhead or modifying your regional strategy.

This approach enables your organization to:

  • Maintain existing governance controls while expanding AI inference capabilities
  • Handle traffic spikes efficiently by using cross-region inference
  • Keep all CloudTrail logs and management within your source region
  • Implement fine-grained access control through customized IAM policies

By configuring your Service Control Policies and IAM roles as demonstrated, you can safely open specific paths for Amazon Bedrock's cross-region inference while maintaining your overall regional governance strategy. This pragmatic approach allows you to adopt advanced AI capabilities within your existing landing zone design, helping you balance innovation with compliance and operational efficiency.

6 Comments

Hello, Thanks for the article as this is an issue I have been dealing with the past few weeks.

From my own testing, there is no need to include the specific region foundational model arn in the SCP condition: "arn:aws:bedrock:eu-north-1::foundation-model/",
"arn:aws:bedrock:eu-west-1::foundation-model/", "arn:aws:bedrock:eu-west-3::foundation-model/"

Only the arn of the inference profile seems to be needed: (I am using a wildcard before inference profile to also account for application-profiles for cross-region models) "arn:aws:bedrock:eu-central-1:*:inference-profile/".

Doing this still allows CRI to work with both inference profiles and application inference profiles.

Presumably, the secondary API call to the specific foundation model resource in the other non-governed regions still contains information that the request is coming from the source region inference profile ARN. However, as these logs are not stored in CloudTrail, I cannot confirm this.

Is this something you can please confirm yourselves?

Ideally, we do not want to explicitly allow the specific non-governed region model arns, as this could theoretically allow the use of single region inference and hence data being sent/stored in those non-governed regions.

replied a year ago

You are right: the region-specific foundation model list in the SCP is neither needed nor has any effect, as the requests come over the inference profile. Note that even with these entries, the SCP would block direct calls from the non-governed regions, as these do not come over the inference profile. I updated the article to reflect this simplification. Instead of

      "ArnNotLike": {
          "aws:PrincipalARN": [
            "arn:aws:iam::*:role/Role1AllowedToBypassThisSCP",
            "arn:aws:iam::*:role/Role1AllowedToBypassThisSCP"
          ],
          "bedrock:InferenceProfileArn": [
              "arn:aws:bedrock:eu-central-1:*:inference-profile/eu.*",
              "arn:aws:bedrock:eu-north-1:*:foundation-model/*",
              "arn:aws:bedrock:eu-west-1:*:foundation-model/*",
              "arn:aws:bedrock:eu-west-3:*:foundation-model/*"
          ]

we can use

      "ArnNotLike": {
          "aws:PrincipalARN": [
            "arn:aws:iam::*:role/Role1AllowedToBypassThisSCP",
            "arn:aws:iam::*:role/Role1AllowedToBypassThisSCP"
          ],
          "bedrock:InferenceProfileArn": [
              "arn:aws:bedrock:eu-central-1:*:inference-profile/eu.*"
          ]

I explicitly tested it with eu-west-3 as the source region and a model like eu.anthropic.claude-3-5-sonnet-20240620-v1:0, which consistently shows that the call is going over eu-central-1.

AWS
EXPERT
replied a year ago

Extending the SCP to allow application inference profiles like 'inferenceProfileArn': 'arn:aws:bedrock:eu-central-1:123456789101:application-inference-profile/vgixe80c1w1v' definitely makes sense, since they allow for tagging and simpler cost tracking across an organization and multiple departments.

AWS
EXPERT
replied a year ago

How would the cost reporting work? Would the cost be attributed to the region that actually served the request or source region of the call?

replied 10 months ago

How would the cost reporting work? Would the cost be attributed to the region that actually served the request or source region of the call?

Based on the documentation, the costs are the same as in the source region, even if another target region would be more expensive. I therefore expect Cost Explorer to show the costs attributed to the source region. There is an attribute in the CloudTrail logs that shows which inference region was actually used. The post also contains a CloudWatch sample dashboard for that.

"There's no additional routing cost for using cross-Region inference. The price is calculated based on the Region from which you call an inference profile. For information about pricing, see Amazon Bedrock pricing." (from https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html#:~:text=Note%20the%20following%20information%20about%20cross%2DRegion%20inference%3A)

AWS
EXPERT
replied 9 months ago

Do I have to opt-in for additional regions, which are not enabled by default to be able to use CRIS?

The answer is provided in the documentation, as follows:

You can use all destination Regions in a cross-Region inference geography regardless of Region-opt status– Certain AWS generative AI services including Amazon Bedrock (see Increase throughput with cross-Region inference) and Amazon Q Developer (see Cross-region processing in Amazon Q Developer) use cross-region inference. If you use those services, they automatically select the optimal AWS Region–including Regions that you have not enabled for resources and IAM data–within your chosen geography. This improves the customer experience by maximizing available compute and model availability.

AWS
EXPERT
replied 9 months ago