Amazon Bedrock Inference Cost granularity based on IAM

6 minute read
Content level: Advanced

Answering the question: who and what is driving our Amazon Bedrock spend?

1. The Problem: AI Inference Spend Without Visibility

Before this feature, Amazon Bedrock usage appeared on your AWS bill as aggregated charges per model, per region, and per token direction (input vs. output). That is useful for a total, but it leaves the most important business questions unanswered:

  • Which team or cost center is responsible for this month's Bedrock charges?
  • Is a single power user or a single misbehaving application driving the increase?
  • When we bill internal business units or external tenants, what share should each one pay?
  • Which models are being consumed by which projects, and is that aligned with budget?

Without per-caller attribution, organizations typically resorted to workarounds — separate AWS accounts per team, custom logging pipelines that parse CloudTrail events and join them with token counts, homegrown proxies that stamp metadata on each request, or manual allocation based on rough estimates. Each of these adds engineering cost, operational risk, and delay.

Common Pain Points

| Pain Point | Business Impact |
|---|---|
| No visibility into who is calling Bedrock | Chargeback and showback are impossible or rely on estimates |
| Inference costs appear as one aggregate line | Budget owners cannot forecast or set guardrails per team |
| Shared roles hide individual users behind one identity | Multi-tenant SaaS cannot bill tenants fairly |
| Custom instrumentation needed to map usage to users | Engineering teams build and maintain reporting plumbing instead of product |
| Cost spikes hard to diagnose | Slow response to runaway jobs or prompt-injection abuse |

2. The Challenge: Why Per-Caller Attribution Is Hard

Attributing inference cost accurately is harder than it looks. A production Bedrock workload can involve:

  • Many identity shapes. A single account may mix IAM users, IAM roles attached to compute (EC2, Lambda, ECS, EKS), Bedrock API keys, and federated identities that assume a shared role.
  • Ephemeral sessions. STS-issued credentials last minutes to hours, which makes it hard to tie usage back to a durable business entity such as a team or cost center.
  • Gateways and proxies. An LLM gateway typically authenticates the end user at its own layer and then calls Bedrock with one service role, collapsing every user into a single identity on the AWS bill.
  • Variable pricing dimensions. Pricing depends on model, region, input vs. output tokens, and feature (batch, provisioned throughput, guardrails, knowledge bases). Attribution must preserve all of these.
  • Tamper resistance. If users can set their own cost attribution tags at request time, chargeback reports become unreliable.

A robust solution has to honor all of these constraints without forcing customers to rewrite applications or operate new infrastructure.


3. The Solution: Granular Cost Attribution, Built In

Amazon Bedrock now captures the IAM principal ARN on every inference request and emits it into the billing pipeline as a first-class field. Combined with IAM principal tags and session tags, this gives you two complementary dimensions:

  1. Identity attribution — a new line_item_iam_principal column in CUR 2.0 shows the exact ARN that made each call.
  2. Business dimension aggregation — IAM tags flow into billing data with the iamPrincipal/ prefix, so you can group spend by team, project, cost center, tenant, or any custom dimension.
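As a rough illustration of the two dimensions, the sketch below totals cost once per principal ARN and once per tag value. It assumes the export has already been loaded into dictionaries keyed by the column names used in this article, that the cost column holds a plain decimal string, and that tags arrive as a JSON object. The sample values and the "(untagged)" bucket label are illustrative choices, not part of the feature.

```python
import json
from collections import defaultdict
from decimal import Decimal

# Illustrative rows only; real CUR 2.0 rows carry many more columns.
ROWS = [
    {"line_item_iam_principal": "arn:aws:iam::123456789012:user/alice",
     "line_item_unblended_cost": "0.069",
     "tags": '{"iamPrincipal/team":"ds"}'},
    {"line_item_iam_principal": "arn:aws:iam::123456789012:user/alice",
     "line_item_unblended_cost": "0.214",
     "tags": '{"iamPrincipal/team":"ds"}'},
    {"line_item_iam_principal": "arn:aws:sts::123456789012:assumed-role/AppRole/session-123",
     "line_item_unblended_cost": "0.198",
     "tags": '{"iamPrincipal/project":"chatbot"}'},
]


def spend_by_principal(rows):
    """Identity attribution: total cost per calling ARN."""
    totals = defaultdict(Decimal)
    for row in rows:
        totals[row["line_item_iam_principal"]] += Decimal(row["line_item_unblended_cost"])
    return dict(totals)


def spend_by_tag(rows, tag_key):
    """Business dimension aggregation: total cost per value of one IAM tag."""
    totals = defaultdict(Decimal)
    for row in rows:
        tags = json.loads(row.get("tags") or "{}")
        # Rows without the tag land in an explicit bucket so nothing is dropped.
        value = tags.get(f"iamPrincipal/{tag_key}", "(untagged)")
        totals[value] += Decimal(row["line_item_unblended_cost"])
    return dict(totals)


if __name__ == "__main__":
    print(spend_by_principal(ROWS))
    print(spend_by_tag(ROWS, "team"))
```

Decimal is used instead of float so that small per-request costs sum without rounding drift.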

Key Properties

  • Automatic. No code changes, no new SDKs, no agents. Attribution happens server-side.
  • Universal. Works across all Bedrock-supported foundation models and inference modes.
  • Tamper resistant. Federated session tags are cryptographically signed inside the OIDC token or SAML assertion.
  • No additional charge. Available in commercial AWS regions as part of Bedrock and AWS Billing.
  • Familiar tooling. Results appear in CUR 2.0 and Cost Explorer, the same tools your FinOps team already uses.

What You See in CUR 2.0

With IAM principal data enabled on your data export, CUR 2.0 rows for Bedrock inference include both the caller identity and the tags attached to that identity or session.

| line_item_iam_principal | line_item_usage_type | line_item_unblended_cost | tags |
|---|---|---|---|
| arn:aws:iam::123456789012:user/alice | USE1-Claude4.6Sonnet-input-tokens | $0.069 | {"iamPrincipal/team":"ds"} |
| arn:aws:iam::123456789012:user/alice | USE1-Claude4.6Sonnet-output-tokens | $0.214 | {"iamPrincipal/team":"ds"} |
| arn:aws:sts::123456789012:assumed-role/AppRole/session-123 | USE1-Claude4.6Opus-input-tokens | $0.198 | {"iamPrincipal/project":"chatbot"} |

The line_item_usage_type column continues to encode region, model, and token direction, so you can slice spend by any combination of caller, model, and direction.
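A minimal sketch of that slicing, assuming usage types follow the region-model-direction-tokens shape seen in the sample rows. That shape is inferred from the examples in this article and is not a documented grammar; usage types for other Bedrock features may look different, and the parser rejects anything it does not recognize.

```python
from collections import defaultdict
from decimal import Decimal
from typing import NamedTuple


class UsageType(NamedTuple):
    region: str
    model: str
    direction: str  # "input" or "output"


def parse_usage_type(usage_type):
    """Split e.g. 'USE1-Claude4.6Sonnet-input-tokens' into its parts.

    Assumed layout: <region>-<model>-<direction>-tokens. The model part
    may itself contain hyphens, so the string is split from both ends.
    """
    region, rest = usage_type.split("-", 1)
    model, direction, unit = rest.rsplit("-", 2)
    if unit != "tokens" or direction not in ("input", "output"):
        raise ValueError(f"unrecognized usage type: {usage_type}")
    return UsageType(region, model, direction)


# (principal, usage type, cost) tuples; illustrative values only.
SAMPLE_ROWS = [
    ("arn:aws:iam::123456789012:user/alice", "USE1-Claude4.6Sonnet-input-tokens", "0.069"),
    ("arn:aws:iam::123456789012:user/alice", "USE1-Claude4.6Sonnet-output-tokens", "0.214"),
    ("arn:aws:sts::123456789012:assumed-role/AppRole/session-123",
     "USE1-Claude4.6Opus-input-tokens", "0.198"),
]


def spend_by_slice(rows):
    """Total cost per (principal, model, direction) combination."""
    totals = defaultdict(Decimal)
    for principal, usage_type, cost in rows:
        usage = parse_usage_type(usage_type)
        totals[(principal, usage.model, usage.direction)] += Decimal(cost)
    return dict(totals)


if __name__ == "__main__":
    for key, total in spend_by_slice(SAMPLE_ROWS).items():
        print(key, total)
```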

Identity Shapes Supported

| How Bedrock Was Called | Value in line_item_iam_principal |
|---|---|
| IAM user | …user/alice |
| Bedrock API key (backed by an IAM user) | …user/BedrockAPIKey-234s |
| IAM role assumed by an application (Lambda, EC2, ECS, EKS) | …assumed-role/AppRole/session-name |
| Federated user from an IdP (Okta, Entra ID, Auth0, Cognito) | …assumed-role/Role/user@acme.org |
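For reporting, it can help to map each ARN back to one of these shapes. The sketch below does so with two heuristics that are assumptions drawn from the examples above, not guarantees: user names beginning with BedrockAPIKey- are treated as API keys, and session names containing "@" are treated as federated users. The function name, the shape labels, and the full ARNs used for testing are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Matches IAM user ARNs and STS assumed-role ARNs in the standard aws partition.
ARN_PATTERN = re.compile(
    r"^arn:aws:(?:iam|sts)::(?P<account>\d{12}):"
    r"(?P<kind>user|assumed-role)/(?P<path>.+)$"
)


class Caller(NamedTuple):
    shape: str              # "iam-user", "bedrock-api-key", "assumed-role", "federated-user"
    account: str
    name: str               # user name or role name
    session: Optional[str]  # session name for assumed roles, else None


def classify_principal(arn):
    """Classify a line_item_iam_principal value into an identity shape."""
    match = ARN_PATTERN.match(arn)
    if match is None:
        raise ValueError(f"unsupported principal ARN: {arn}")
    account, kind, path = match["account"], match["kind"], match["path"]
    if kind == "user":
        # IAM users may sit under a path such as user/division/alice.
        name = path.rsplit("/", 1)[-1]
        # Heuristic: prefix taken from the example in the table above.
        shape = "bedrock-api-key" if name.startswith("BedrockAPIKey-") else "iam-user"
        return Caller(shape, account, name, None)
    role, _, session = path.partition("/")
    # Heuristic: an email-like session name suggests a federated user.
    shape = "federated-user" if "@" in session else "assumed-role"
    return Caller(shape, account, role, session or None)


if __name__ == "__main__":
    print(classify_principal("arn:aws:sts::123456789012:assumed-role/AppRole/session-123"))
```

Because both heuristics depend on naming conventions, a deployment that names sessions differently would need its own rule, for example a lookup keyed on role name.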

4. Prerequisites and One-Time Setup

Before attribution appears in your reports, complete these one-time steps.

Step 1 — Enable IAM Principal Data in CUR 2.0

The line_item_iam_principal column is opt-in per data export configuration.

  1. Open the AWS Billing and Cost Management console.
  2. Navigate to Data Exports and select your CUR 2.0 export (or create a new one).
  3. In the data export configuration, enable IAM principal data.
  4. Save the configuration.

Once the setting is enabled, exports include the line_item_iam_principal column going forward.

Step 2 — Plan Your Tagging Strategy

Decide which business dimensions you want to aggregate by. Common choices:

  • team — data-science, engineering, marketing
  • cost-center — finance codes for internal chargeback
  • environment — dev, staging, prod

Fewer, well-governed tags outperform many ad-hoc tags. Refer to AWS's tagging best practices when designing your schema.
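One way to keep a small schema governed is to check proposed tags against it before they are attached to a user or role. The following is a minimal sketch: the allowed values come from the examples above, while the choice of required keys, the free-form cost-center rule, and the validate_tags helper are hypothetical.

```python
# Allowed values per tag key; None means any value is accepted.
ALLOWED_TAGS = {
    "team": {"data-science", "engineering", "marketing"},
    "cost-center": None,  # finance codes vary, so values are not enumerated here
    "environment": {"dev", "staging", "prod"},
}

# Assumption for this sketch: every principal must carry these keys.
REQUIRED_TAGS = ("team", "cost-center")


def validate_tags(tags):
    """Return a list of problems found in a {key: value} tag mapping."""
    problems = []
    for key, value in tags.items():
        if key not in ALLOWED_TAGS:
            problems.append(f"unknown tag key: {key}")
            continue
        allowed = ALLOWED_TAGS[key]
        if allowed is not None and value not in allowed:
            problems.append(f"invalid value for {key}: {value}")
    for key in REQUIRED_TAGS:
        if key not in tags:
            problems.append(f"missing required tag: {key}")
    return problems


if __name__ == "__main__":
    print(validate_tags({"team": "sales", "owner": "bob"}))
```

A check like this could run in a provisioning pipeline so that ad-hoc keys never reach billing data.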

Step 3 — Activate Cost Allocation Tags

After your tags appear on at least one Bedrock request (allow up to 24 hours after first use):

  1. Open the AWS Billing console.
  2. Navigate to Cost allocation tags.
  3. Under the IAM category, locate the tags you have applied (they appear with the iamPrincipal/ prefix).
  4. Select them and choose Activate.

Alternatively, use the UpdateCostAllocationTagsStatus API for automation.
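A hedged sketch of that automation follows. It takes a Cost Explorer client as a parameter and calls update_cost_allocation_tags_status in batches. Two points are assumptions to verify against the API reference: that the tag key passed to the API includes the iamPrincipal/ prefix as shown in the console, and that 20 entries is the per-request limit. The helper names are hypothetical.

```python
def build_activation_batches(tag_names, batch_size=20):
    """Build request payloads that mark each tag as Active.

    Assumption: the API expects the key with its iamPrincipal/ prefix,
    and accepts at most `batch_size` entries per call.
    """
    entries = [
        {"TagKey": f"iamPrincipal/{name}", "Status": "Active"}
        for name in tag_names
    ]
    return [entries[i:i + batch_size] for i in range(0, len(entries), batch_size)]


def activate_tags(ce_client, tag_names):
    """Activate cost allocation tags and return any per-tag errors reported."""
    errors = []
    for batch in build_activation_batches(tag_names):
        response = ce_client.update_cost_allocation_tags_status(
            CostAllocationTagsStatus=batch
        )
        # The response lists tags that could not be updated, if any.
        errors.extend(response.get("Errors", []))
    return errors
```

With boto3 installed and credentials configured, the client would come from boto3.client("ce"), and a call might look like activate_tags(client, ["team", "cost-center"]). A tag that has not yet appeared on any Bedrock request would be expected to come back in the returned error list rather than raise an exception.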

Activated tags appear in CUR 2.0 and Cost Explorer within 24–48 hours.