Unfortunately, there is no direct way to get the embedding token count for an invoke_agent call without using CloudWatch. When a knowledge base (KB) is associated with the agent, the embedding step does consume tokens and incur costs, but that usage is not reported in the API response.
Token usage for embeddings is typically visible only through CloudWatch metrics and logs. CloudWatch reports token consumption for each model invocation: input and output tokens for text generation, and input tokens for embedding calls, since an embedding model returns vectors rather than generated text.
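As a rough illustration of the metrics side, the sketch below builds a request for the `InputTokenCount` metric in the `AWS/Bedrock` namespace and sums the returned datapoints. The helper names are hypothetical, the embedding model ID is only an example, and it is an assumption that the embedding calls made on behalf of your agent are reported under that model's `ModelId` dimension in your account.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def input_token_metric_request(
    model_id: str, start: datetime, end: datetime, period_seconds: int = 3600
) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch get_metric_statistics (hypothetical helper)."""
    return {
        "Namespace": "AWS/Bedrock",
        "MetricName": "InputTokenCount",
        "Dimensions": [{"Name": "ModelId", "Value": model_id}],
        "StartTime": start,
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": ["Sum"],
    }


def total_tokens(response: Dict[str, Any]) -> int:
    """Add up the Sum value of every datapoint in a get_metric_statistics response."""
    return int(sum(point["Sum"] for point in response.get("Datapoints", [])))


# Usage (requires boto3 and AWS credentials; the model ID is an example):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   request = input_token_metric_request(
#       "amazon.titan-embed-text-v2:0",
#       datetime(2024, 1, 1, tzinfo=timezone.utc),
#       datetime(2024, 1, 2, tzinfo=timezone.utc),
#   )
#   print(total_tokens(cloudwatch.get_metric_statistics(**request)))
```

Note that this gives an account-wide total per model, not a per-request figure, so it cannot attribute tokens to a single invoke_agent call.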
The invoke_agent API response does include some token usage information, but it covers the tokens used for the model invocation itself, not the embedding step. The embedding token count is not exposed anywhere in the response structure.
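To see what the response does expose, here is a minimal sketch that sums the model-invocation token counts found in the trace events of an invoke_agent response stream. It assumes tracing is turned on (`enableTrace=True`) and that usage appears under `modelInvocationOutput.metadata.usage` inside each trace part; treat that path as an assumption and check it against the events you actually receive. The function name is hypothetical, and nothing here recovers embedding tokens.

```python
from typing import Any, Dict, Iterable


def sum_trace_token_usage(events: Iterable[Dict[str, Any]]) -> Dict[str, int]:
    """Sum inputTokens/outputTokens reported in invoke_agent trace events."""
    totals = {"inputTokens": 0, "outputTokens": 0}
    for event in events:
        # Non-trace events (for example response chunks) have no "trace" key.
        trace = event.get("trace", {}).get("trace", {})
        # Check every trace kind (orchestrationTrace, preProcessingTrace, ...).
        for part in trace.values():
            if not isinstance(part, dict):
                continue
            usage = (
                part.get("modelInvocationOutput", {})
                .get("metadata", {})
                .get("usage", {})
            )
            totals["inputTokens"] += usage.get("inputTokens", 0)
            totals["outputTokens"] += usage.get("outputTokens", 0)
    return totals


# Usage (requires boto3 and AWS credentials; the IDs are placeholders):
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   response = client.invoke_agent(
#       agentId="AGENT_ID",
#       agentAliasId="ALIAS_ID",
#       sessionId="session-1",
#       inputText="What is our refund policy?",
#       enableTrace=True,
#   )
#   print(sum_trace_token_usage(response["completion"]))
```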
If you need to track embedding token counts without relying on CloudWatch, you are out of luck for now: Amazon Bedrock currently provides no alternative API or method for retrieving this information.
For more granular tracking and attribution of token usage, including embeddings, use CloudWatch Logs Insights. With model invocation logging enabled, you can query the invocation logs and aggregate token usage by identity or application.
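A query of that kind might look like the sketch below, which builds a Logs Insights query that aggregates token counts by caller identity and model, then runs it through the CloudWatch Logs client. The field names (`identity.arn`, `modelId`, `input.inputTokenCount`, `output.outputTokenCount`) are those of Bedrock model invocation log records as I understand them; the helper names, the log group name, and the model ID filter are assumptions to adapt to your setup.

```python
import time
from typing import Any, Dict, Optional


def build_token_usage_query(model_id_substring: Optional[str] = None) -> str:
    """Build a Logs Insights query that sums token counts per identity and model."""
    lines = [
        "fields @timestamp, identity.arn, modelId, "
        "input.inputTokenCount, output.outputTokenCount"
    ]
    if model_id_substring:
        # Narrow to one model family, e.g. an embedding model.
        lines.append(f'| filter modelId like "{model_id_substring}"')
    lines.append(
        "| stats sum(input.inputTokenCount) as inputTokens, "
        "sum(output.outputTokenCount) as outputTokens, "
        "count(*) as invocations by identity.arn, modelId"
    )
    return "\n".join(lines)


def run_query(logs_client: Any, log_group: str, query: str,
              start_ts: int, end_ts: int, poll_seconds: float = 1.0) -> Dict[str, Any]:
    """Start a Logs Insights query and poll until it reaches a final status."""
    query_id = logs_client.start_query(
        logGroupName=log_group,
        startTime=start_ts,  # epoch seconds
        endTime=end_ts,
        queryString=query,
    )["queryId"]
    while True:
        result = logs_client.get_query_results(queryId=query_id)
        if result["status"] in ("Complete", "Failed", "Cancelled", "Timeout"):
            return result
        time.sleep(poll_seconds)


# Usage (requires boto3 and AWS credentials; the log group name is a placeholder):
#   import boto3
#   logs = boto3.client("logs")
#   now = int(time.time())
#   query = build_token_usage_query("titan-embed")
#   result = run_query(logs, "/aws/bedrock/modelinvocations", query, now - 86400, now)
#   for row in result["results"]:
#       print({cell["field"]: cell["value"] for cell in row})
```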
If tracking embedding token counts is crucial for your use case, consider integrating CloudWatch into your monitoring workflow, or contact AWS Support to ask about alternatives or future API enhancements that might expose this information more directly.
Sources
The number of output and input tokens in Bedrock responses | AWS re:Post
Improve visibility into Amazon Bedrock usage and performance with Amazon CloudWatch | AWS Machine Learning Blog