Skip to content

Tool XML Text Leak in Nova Sonic 2

0

We have implemented a voice-based conversational agent using the Nova Sonic 2 model. The agent is connected to some tools specific to our use case. Sometimes, the bot produces XML output, and in the logs we see something like this:

< __ function=opensearchSummaryTool> < __ parameter=query> {USER QUERY}

We tried to stop it by filtering out all XML text from the bot response, but the issue is that it is still speaking the audio for this. When we searched about this, we found that it is an ''The Tool XML Text Leak''. How can we resolve this issue?

1 Answer
3
Accepted Answer

As far as I understand your case, the "XML Leak" occurs when the model's internal tool-calling logic is streamed into the output buffer before the orchestration layer can intercept it. For voice-based agents, this is critical because the TTS engine processes these tokens immediately.

I would try the following:

  • Try to switch to the Bedrock Converse API: Instead of manual XML prompting, use the native toolConfig in the Converse API. It strictly separates toolUse blocks from the message content, preventing the audio engine from seeing the XML tags.
  • Define Stop Sequences: Add < or <__ (two _) as Stop Sequences in your inference configuration. This forces the model to stop generating the user-facing response the moment it attempts to invoke a tool.
  • System Prompt Enforcement: Add a directive: "Internal tool calls (XML) must never be part of the verbal response. Use tools silently."
  • Audio Latency Check: Ensure your TTS generation is only triggered by the content block and ignores any tool_use blocks provided by the model response.
EXPERT
answered 24 days ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.