If any AWS folks end up reading this, consider it a feature request as well: expose cost information for multimodal inference the way Anthropic does (and, I believe, OpenAI too). Example: https://docs.anthropic.com/en/docs/build-with-claude/vision#calculate-image-costs
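For context, the linked Anthropic page estimates image token usage directly from pixel dimensions. A minimal sketch of that published guideline (tokens ≈ width × height / 750); the function name and example image size are illustrative only:

```python
def estimate_anthropic_image_tokens(width_px: int, height_px: int) -> int:
    """Estimate image tokens per Anthropic's published guideline:
    tokens ~= (width * height) / 750 for images within the resize limits."""
    return int((width_px * height_px) / 750)

# Example: a 1092x1092 image works out to roughly 1590 tokens.
print(estimate_anthropic_image_tokens(1092, 1092))
```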
When using Claude 3.5 Sonnet through AWS Bedrock for multimodal inference, you're correct that the input token count shown in the response body only reflects the text prompt and not the image. This can indeed make it challenging to estimate actual costs in real time. However, there are a few ways to get a more accurate understanding of your costs:
- Response Headers: After each Bedrock InvokeModel call, check the response headers. You should find headers like 'x-amzn-bedrock-input-token-count' and 'x-amzn-bedrock-output-token-count'. These provide precise token counts for both input and output, which you can multiply by the per-token prices for your model and region (see the sketch after this list).
- Pricing Structure: For Claude models on Bedrock, pricing is based on the total number of tokens processed, both input and output. Image processing is included in this count: the image is converted into tokens the model can process, and those tokens contribute to the overall billed input.
- Cost Estimation: To estimate costs more accurately, create a small sample set of requests with various text lengths and image sizes, run them through the model, and record the total token counts from the response headers. This shows how images of different sizes affect the token count and, consequently, the cost.
- Detailed Billing: While Cost Explorer shows costs with a delay, you can set up AWS Cost and Usage Reports, which provide more granular data and can be analyzed with tools like Amazon Athena to break down costs per invocation or resource (an Athena sketch follows at the end of this answer).
- CloudWatch Logs: If you log your Bedrock calls, include the token counts from the response headers in your logs. This lets you analyze usage patterns and costs more closely.
- AWS Support: If you're planning large-scale usage, it may be worth contacting AWS Support for more detailed guidance on cost estimation for your specific use case.
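To make the response-header approach concrete, here is a minimal boto3 sketch. The model ID, image file, and per-1K-token rates are placeholders you should replace with your own values; the header names are the ones Bedrock returns on InvokeModel responses:

```python
import base64
import json

import boto3

# Placeholder per-1K-token rates -- always confirm against the current
# AWS Bedrock pricing page for your region and model.
INPUT_PRICE_PER_1K = 0.003
OUTPUT_PRICE_PER_1K = 0.015

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

with open("example.png", "rb") as f:  # hypothetical local image
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_b64,
                    },
                },
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ],
}

response = bedrock.invoke_model(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # example model ID
    body=json.dumps(body),
)
result = json.loads(response["body"].read())

# Token counts reported by Bedrock in the HTTP response headers.
headers = response["ResponseMetadata"]["HTTPHeaders"]
input_tokens = int(headers["x-amzn-bedrock-input-token-count"])
output_tokens = int(headers["x-amzn-bedrock-output-token-count"])

estimated_cost = (
    input_tokens / 1000 * INPUT_PRICE_PER_1K
    + output_tokens / 1000 * OUTPUT_PRICE_PER_1K
)
print(f"input={input_tokens} output={output_tokens} est. cost=${estimated_cost:.6f}")
```

Because the input token count in the headers covers the full request, comparing runs with and without the image gives you the effective image token cost for your image sizes.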
Remember, the pricing for multimodal models like Claude can be different from text-only models, and the way images are processed and counted towards your bill may not be directly comparable to other services or even to Anthropic's direct offerings. Always refer to the official AWS Bedrock pricing page for the most up-to-date information.
Given the uncertainty, it's wise to start with a smaller batch of inferences to gauge actual costs before scaling up to larger volumes. This approach will help you better understand the real-world costs associated with your specific usage patterns.
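Regarding the detailed billing point above: once Cost and Usage Reports are delivered to S3 and registered as an Athena table, a query along these lines can break Bedrock spend down by usage type and day. The database, table, and S3 output location are hypothetical, the column names follow the standard CUR-on-Athena schema, and the 'AmazonBedrock' product code is an assumption to verify against your own report data:

```python
import boto3

# Hypothetical names -- replace with your own CUR database/table and an
# S3 location where Athena may write query results.
DATABASE = "cur_db"
TABLE = "cur_table"
OUTPUT_LOCATION = "s3://my-athena-results/bedrock-costs/"

# Standard CUR-on-Athena column names; the Bedrock product code is
# assumed to be 'AmazonBedrock' -- check your report before relying on it.
QUERY = f"""
SELECT
    line_item_usage_type,
    DATE(line_item_usage_start_date) AS usage_day,
    SUM(line_item_usage_amount)      AS usage_amount,
    SUM(line_item_unblended_cost)    AS unblended_cost
FROM {TABLE}
WHERE line_item_product_code = 'AmazonBedrock'
GROUP BY line_item_usage_type, DATE(line_item_usage_start_date)
ORDER BY usage_day, unblended_cost DESC
"""

athena = boto3.client("athena", region_name="us-east-1")
execution = athena.start_query_execution(
    QueryString=QUERY,
    QueryExecutionContext={"Database": DATABASE},
    ResultConfiguration={"OutputLocation": OUTPUT_LOCATION},
)
# Results are written to OUTPUT_LOCATION; fetch them with get_query_results
# once the execution reaches the SUCCEEDED state.
print("Started Athena query:", execution["QueryExecutionId"])
```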
Sources
monitor Bedrock model charge per invocation resource | AWS re:Post
Amazon Titan Multimodal Embeddings G1 model - Token Usage (bedrock) | AWS re:Post
Unfortunately this is mostly quite vague. It would be great to get more information on how to set up the advanced billing reports mentioned here.