Embedding token count in Bedrock


When I call the invoke_agent API, no embedding happens if no knowledge base (KB) is associated with the agent. If a KB is associated, the question is embedded, which is understandable since the embedding is used to search the KB. But this embedding has a cost, which shows up as input tokens in CloudWatch. The problem is that I can't see this input token count in the invoke_agent trace or any other trace; it is visible only in CloudWatch. Since it contributes to the overall cost, I want a way to track it without having to use CloudWatch. Is this possible through an API?

1 Answer

Unfortunately, there isn't a direct way to track the embedding token count for the invoke_agent API without using CloudWatch. The embedding process, which occurs when a knowledge base (KB) is associated with the agent, does indeed consume tokens and incur costs, but this information is not explicitly provided in the API response.

The token usage for embeddings is visible only through CloudWatch metrics and logs. CloudWatch reports token consumption per model, which covers the input tokens consumed by the embedding call as well as the input and output tokens of the model invocation.
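As a rough illustration of reading that metric programmatically, here is a minimal sketch that sums `InputTokenCount` for a single model ID. It assumes the metric lives in the `AWS/Bedrock` namespace with a `ModelId` dimension, and that you know which embedding model your KB uses; the function name `embedding_input_tokens` is hypothetical, and the client is passed in so the function can be exercised without AWS credentials.

```python
from datetime import datetime, timedelta, timezone


def embedding_input_tokens(cloudwatch, model_id, hours=24):
    """Sum the InputTokenCount metric for one model over the last `hours`.

    `cloudwatch` is expected to behave like boto3.client("cloudwatch").
    Assumption: the namespace and dimension names below match your account.
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/Bedrock",
        MetricName="InputTokenCount",
        Dimensions=[{"Name": "ModelId", "Value": model_id}],
        StartTime=start,
        EndTime=end,
        Period=3600,            # one datapoint per hour
        Statistics=["Sum"],
    )
    # Add up the hourly sums; an empty response yields 0.
    return int(sum(point["Sum"] for point in response.get("Datapoints", [])))


# Example usage (requires boto3 and AWS credentials; the model ID is only
# an example and should be replaced with your KB's embedding model):
#   import boto3
#   cw = boto3.client("cloudwatch")
#   print(embedding_input_tokens(cw, "amazon.titan-embed-text-v2:0"))
```

Note that this still goes through CloudWatch; it only moves the lookup from the console into code, and the total covers every call to that model in the account, not just one agent.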

While the invoke_agent API response does include some token usage information in its trace (when tracing is enabled), that information covers the tokens used for the model invocation itself rather than the embedding step. The embedding token count is not exposed through the API response structure.
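To show what the trace does expose, here is a minimal sketch that totals the model invocation tokens from an invoke_agent event stream. It assumes each trace event nests usage under `trace.trace.<step>.modelInvocationOutput.metadata.usage` with `inputTokens` and `outputTokens` keys; the helper name `sum_trace_usage` is hypothetical, and embedding tokens will not appear in this total.

```python
def sum_trace_usage(events):
    """Total the model-invocation token usage found in invoke_agent events.

    `events` is any iterable of event dicts, such as response["completion"].
    Assumption: usage sits under <step>.modelInvocationOutput.metadata.usage.
    """
    totals = {"inputTokens": 0, "outputTokens": 0}
    for event in events:
        # Non-trace events (for example answer chunks) have no "trace" key.
        trace = event.get("trace", {}).get("trace", {})
        for step in trace.values():  # e.g. orchestrationTrace, preProcessingTrace
            if not isinstance(step, dict):
                continue
            usage = (
                step.get("modelInvocationOutput", {})
                .get("metadata", {})
                .get("usage", {})
            )
            for key in totals:
                totals[key] += usage.get(key, 0)
    return totals


# Example usage (requires boto3 and AWS credentials; all IDs are placeholders):
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   response = client.invoke_agent(
#       agentId="AGENT_ID", agentAliasId="ALIAS_ID", sessionId="session-1",
#       inputText="What is our refund policy?", enableTrace=True,
#   )
#   print(sum_trace_usage(response["completion"]))
```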

If you need to track these embedding token counts without relying on CloudWatch, you will run into a limitation: Amazon Bedrock currently provides no alternative API or method to retrieve this specific information outside of CloudWatch.

For more granular tracking and attribution of token usage, including embeddings, you would typically need to use CloudWatch Logs Insights. This allows you to query the invocation logs and aggregate token usage by different identities or applications.
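A Logs Insights query of that kind can also be run from code. The sketch below is illustrative only: it assumes model invocation logging is enabled with a CloudWatch Logs destination, that log records carry `modelId` and `input.inputTokenCount` fields, and that embedding models can be picked out by "embed" appearing in the model ID. The function name `embedding_tokens_by_model` and the log group name in the usage comment are hypothetical.

```python
import time

# Assumption: field names follow the Bedrock model invocation log format,
# and embedding model IDs contain the substring "embed".
EMBEDDING_TOKEN_QUERY = """
fields modelId, input.inputTokenCount
| filter modelId like /embed/
| stats sum(input.inputTokenCount) as inputTokens by modelId
""".strip()


def embedding_tokens_by_model(logs, log_group, start_epoch, end_epoch,
                              poll_seconds=1.0):
    """Run the query and return {modelId: total input tokens}.

    `logs` is expected to behave like boto3.client("logs");
    start_epoch and end_epoch are Unix timestamps in seconds.
    """
    query_id = logs.start_query(
        logGroupName=log_group,
        startTime=start_epoch,
        endTime=end_epoch,
        queryString=EMBEDDING_TOKEN_QUERY,
    )["queryId"]

    # Poll until the query leaves the Scheduled/Running states.
    while True:
        result = logs.get_query_results(queryId=query_id)
        if result["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(poll_seconds)
    if result["status"] != "Complete":
        raise RuntimeError(f"query ended with status {result['status']}")

    totals = {}
    for row in result["results"]:
        # Each row is a list of {"field": ..., "value": ...} cells.
        fields = {cell["field"]: cell["value"] for cell in row}
        totals[fields["modelId"]] = int(float(fields["inputTokens"]))
    return totals


# Example usage (requires boto3 and AWS credentials; log group is a placeholder):
#   import boto3
#   logs = boto3.client("logs")
#   now = int(time.time())
#   print(embedding_tokens_by_model(logs, "/my/bedrock/invocation-logs",
#                                   now - 86400, now))
```

Grouping by `identity.arn` instead of `modelId` in the `stats` clause would attribute the same totals to callers rather than models.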

If tracking these embedding token counts is crucial for your use case, you may need to consider integrating CloudWatch into your monitoring workflow or reaching out to AWS support for potential alternatives or future enhancements to the API that could expose this information more directly.
Sources
The number of output and input tokens in Bedrock responses | AWS re:Post
Improve visibility into Amazon Bedrock usage and performance with Amazon CloudWatch | AWS Machine Learning Blog

answered a month ago
AWS
SUPPORT ENGINEER
reviewed a month ago
