Tokens in Bedrock Invoke Agent


There are a few questions regarding the token counts in the invoke agent API:

  1. When my agent is connected to a knowledge base, in the boto3 event stream the second trace doesn't contain input and output tokens, but the fifth trace does. From the second question onwards in the same session, the second trace shows a strictly increasing token count, while the fifth trace shows no clear pattern (it sometimes increases and sometimes decreases).
  2. When I ask a question, especially about current affairs, the LLM uses a search tool that makes its own LLM call with its own set of input and output tokens, and there is an extra trace with input and output tokens if the tool call fails and is retried.
  3. When the session ends, an API call is made to generate the summary if memory is enabled.
  4. When memory is enabled, there is also another search that consumes input and output tokens when the memory is checked.
  5. There is also another call, with input tokens only, for the embeddings when a knowledge base is associated with the agent.

How do I calculate the total token cost for a given question when all these variables are in play: whether a knowledge base is associated, the number of tools used, failed, and retried, whether memory is enabled, and so on? Also, what is the purpose of the different traces, each with its own input and output tokens?

1 Answer
Accepted Answer

Calculating the total token cost for a given question in Amazon Bedrock's Invoke Agent API can be complex due to the various factors you've mentioned. Here's a breakdown of how to approach this:

  1. Knowledge Base Interactions: When an agent is connected to a knowledge base, token counts may appear in different traces. The increasing count in the second trace from the second question onwards likely represents the accumulating conversation history. The fifth trace's fluctuating numbers might reflect specific knowledge base interactions or tool usage.

  2. Tool Usage and Search: When the agent uses search tools or makes additional LLM calls, these contribute to the total token count. Each tool call, including retries, adds to the overall token usage.

  3. Session Summary: If memory is enabled, the API call made at the end of the session for summarization contributes additional tokens.

  4. Memory Checks: With memory enabled, searches within the memory also consume tokens.

  5. Knowledge Base Embeddings: Calls for embeddings when using an associated knowledge base add to the input token count.
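To see where these per-step numbers come from, you can enable tracing on the InvokeAgent call and walk each trace event. The sketch below assumes the nested payload shape returned with `enableTrace=True` (`trace` → `trace` → a step key such as `orchestrationTrace` → `modelInvocationOutput` → `metadata` → `usage`); the key names follow the Bedrock agent trace documentation, but verify them against your own event stream, since not every step type carries usage metadata.

```python
def usage_by_step(events):
    """Return (trace_step, inputTokens, outputTokens) for each model call,
    in the order the events arrive on the InvokeAgent completion stream."""
    steps = []
    for event in events:
        # Non-trace events (e.g. "chunk" answer fragments) are skipped.
        trace = event.get("trace", {}).get("trace", {})
        for step, body in trace.items():
            if not isinstance(body, dict):
                continue
            usage = (body.get("modelInvocationOutput", {})
                         .get("metadata", {})
                         .get("usage"))
            if usage:
                steps.append((step,
                              usage.get("inputTokens", 0),
                              usage.get("outputTokens", 0)))
    return steps

# With boto3 (call shape per the bedrock-agent-runtime API; not run here):
# client = boto3.client("bedrock-agent-runtime")
# response = client.invoke_agent(agentId="...", agentAliasId="...",
#                                sessionId="...", inputText="...",
#                                enableTrace=True)
# print(usage_by_step(list(response["completion"])))
```

Printing this list per question makes it easy to see which step (pre-processing, orchestration, post-processing, and so on) each fluctuating count belongs to.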

To calculate the total token cost, you should sum up all the input and output tokens across all traces and calls related to your question. This includes:

  • Main conversation tokens
  • Knowledge base interaction tokens
  • Tool usage tokens (including search and LLM calls)
  • Memory check tokens
  • Embedding tokens
  • Session summary tokens (if applicable)
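A minimal aggregation sketch of that summation, assuming each model call's usage appears under `modelInvocationOutput.metadata.usage` inside a trace event (the shape documented for `enableTrace=True`). Note that knowledge-base embedding tokens are billed against the embedding model and may not appear in the agent trace at all, so they would need to be counted separately.

```python
def total_tokens(events):
    """Sum input and output tokens across every trace event of one invocation."""
    total_in = total_out = 0
    for event in events:
        trace = event.get("trace", {}).get("trace", {})
        for body in trace.values():
            if not isinstance(body, dict):
                continue  # some trace payloads are plain strings
            usage = (body.get("modelInvocationOutput", {})
                         .get("metadata", {})
                         .get("usage", {}))
            total_in += usage.get("inputTokens", 0)
            total_out += usage.get("outputTokens", 0)
    return total_in, total_out
```

Running this over the full event stream of a question gives one (input, output) pair that already includes tool retries and memory lookups, since each of those produces its own trace event.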

The different traces serve various purposes:

  • They help in tracking token usage for different components of the agent's processing.
  • They allow for detailed analysis of how tokens are being used in different stages of the conversation.
  • They can be useful for optimizing performance and managing costs by identifying which parts of the process are token-intensive.

It's important to note that the exact implementation and impact may vary depending on the specific model and configuration you're using. For the most accurate information regarding your specific use case, it's advisable to consult the AWS documentation or reach out to AWS support.

To effectively manage and calculate token usage, consider implementing a system to aggregate these various token counts from all relevant traces and calls. This will give you a comprehensive view of the total token cost for each interaction with the agent.
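Once the totals are aggregated, converting them to dollars is a straightforward rate calculation. The per-1,000-token prices below are placeholders, not real Bedrock rates; look up the rates for your specific model and region on the Bedrock pricing page, and remember that tool-side and embedding-model calls may be billed at different rates than the agent's main model.

```python
def token_cost(input_tokens, output_tokens,
               price_in_per_1k, price_out_per_1k):
    """Dollar cost for one interaction, given per-1,000-token prices.

    price_in_per_1k / price_out_per_1k are hypothetical example rates;
    substitute the published rates for your model and region.
    """
    return ((input_tokens / 1000) * price_in_per_1k
            + (output_tokens / 1000) * price_out_per_1k)
```

For example, 2,000 input and 1,000 output tokens at placeholder rates of $0.003 and $0.015 per 1,000 tokens would cost $0.021 for that interaction.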

Sources
The number of output and input tokens in Bedrock responses | AWS re:Post
Tokens processed in Amazon Bedrock | AWS re:Post

answered a month ago
reviewed a month ago by an AWS Expert
