The behavior you're experiencing with multiple tool calls in Amazon Nova Sonic 2 appears to be related to how the model handles sequential tool invocations and the event-driven architecture of the bidirectional streaming API.
Based on the documented architecture, Amazon Nova Sonic uses an event-based protocol where the client and model exchange structured JSON events to control session lifecycle, audio streaming, text responses, and tool interactions. When using the "auto" tool choice parameter, the model has maximum flexibility to decide whether tools are needed and can call multiple tools if required.
To address the infinite loop issue, consider these approaches:
Tool Choice Configuration: Instead of using "auto" for scenarios where you expect multiple tool calls, you might experiment with the "any" parameter, which ensures at least one tool is called at the beginning of the response with the model selecting the most appropriate one. This may provide more predictable behavior for multi-tool scenarios.
Event Flow Management: Since Amazon Nova Sonic implements asynchronous tool calling that allows the AI to continue conversing naturally while tools run in the background, ensure your application properly manages the complete event lifecycle. After receiving toolUse events and returning toolResult events, verify that you're sending the appropriate contentEnd, promptEnd, and sessionEnd events to properly signal completion of the interaction flow.
Tool Result Structure: When returning tool results, ensure they're formatted as proper toolResult events with clear, conclusive information that signals to the model that the tool execution is complete and no further calls are needed.
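As a rough illustration of the event flow and tool result structure described above, the sketch below builds the three events that open, fill, and close one tool result block. The envelope and field names (contentStart, toolResultInputConfiguration, promptName, contentName) are assumptions based on AWS sample code for Nova Sonic and should be checked against the current documentation; build_tool_result_events is a hypothetical helper, not an SDK function.

```python
import json
import uuid


def build_tool_result_events(prompt_name: str, tool_use_id: str, result: dict) -> list:
    """Return the events that open, fill, and close one tool result block.

    Field names are assumed from AWS sample code; verify before relying on them.
    """
    content_name = str(uuid.uuid4())  # fresh name for this content block
    return [
        {"event": {"contentStart": {
            "promptName": prompt_name,
            "contentName": content_name,
            "interactive": False,
            "type": "TOOL",
            "role": "TOOL",
            "toolResultInputConfiguration": {
                "toolUseId": tool_use_id,  # must echo the id from the toolUse event
                "type": "TEXT",
                "textInputConfiguration": {"mediaType": "text/plain"},
            },
        }}},
        {"event": {"toolResult": {
            "promptName": prompt_name,
            "contentName": content_name,
            "content": json.dumps(result),  # result is sent as a JSON string
        }}},
        {"event": {"contentEnd": {
            "promptName": prompt_name,
            "contentName": content_name,
        }}},
    ]
```

Sending all three events in order, one block per toolUse, is what signals that the tool result is complete; promptEnd and sessionEnd only follow when the whole interaction is over.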
The hallucinated XML in textOutput suggests the model may be confused about the proper protocol for tool invocation. This reinforces the importance of clear tool descriptions and proper event handling. Make sure your tool definitions use descriptive, action-oriented names and clear descriptions that help the model understand when and how to use each tool appropriately.
If the issue persists with temperature set to 0 and proper event handling, this may represent a limitation in how the model currently handles complex multi-tool scenarios that would benefit from AWS support engagement.
Sources
Tool configuration - Amazon Nova
Make your web apps hands-free with Amazon Nova Sonic | Artificial Intelligence
Using the Amazon Nova Sonic Speech-to-Speech model - Amazon Nova
This is a known pain point with Nova Sonic 2 multi-tool bidirectional streaming. I've seen both issues you described. Here's what worked for us:
1. The infinite loop — root cause and fix
The loop usually happens because the conversation state isn't cleanly closing the tool-use cycle before the next turn begins. The model sees "tool results received" but the stream context still looks like it's mid-tool-use, so it re-invokes.
Fix: Implement an explicit tool call state machine
Track tool state explicitly on your side:
    import json

    class ToolCallState:
        """Tracks the lifecycle of tool calls within a single turn."""

        def __init__(self):
            self.pending_calls = {}       # toolUseId -> tool_name
            self.completed_calls = set()  # toolUseIds whose results were returned
            self.turn_tool_calls = []     # all tool calls seen in this turn

        def register_call(self, tool_use_id, tool_name, input_data):
            # Deduplicate: if the same tool + input was already called this turn, skip it
            call_signature = f"{tool_name}:{json.dumps(input_data, sort_keys=True)}"
            if any(c['sig'] == call_signature for c in self.turn_tool_calls):
                return False  # already called this turn
            self.pending_calls[tool_use_id] = tool_name
            self.turn_tool_calls.append({'id': tool_use_id, 'sig': call_signature})
            return True

        def mark_completed(self, tool_use_id):
            self.completed_calls.add(tool_use_id)
            self.pending_calls.pop(tool_use_id, None)

        def all_results_returned(self):
            return len(self.pending_calls) == 0

        def reset_turn(self):
            self.pending_calls = {}
            self.completed_calls = set()
            self.turn_tool_calls = []
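Only start the next conversation turn after all_results_returned() is True and every tool result block has been closed, so that the model has moved on to response generation. In Nova Sonic's event protocol, that means a contentEnd event has followed each toolResult (contentBlockStop is the Converse streaming API's name for the equivalent event).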
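The gating rule above can be sketched as a small helper that distinguishes "result not yet sent" from "result sent but block not yet closed". TurnGate and its method names are hypothetical and not part of any SDK; how you detect that a block has been closed depends on your stream handler.

```python
class TurnGate:
    """Decides when it is safe to start the next turn (illustrative sketch)."""

    def __init__(self):
        self.awaiting_result = set()  # toolUseIds seen, result not yet sent
        self.awaiting_close = set()   # result sent, block not yet closed

    def on_tool_use(self, tool_use_id: str) -> None:
        self.awaiting_result.add(tool_use_id)

    def on_result_sent(self, tool_use_id: str) -> None:
        self.awaiting_result.discard(tool_use_id)
        self.awaiting_close.add(tool_use_id)

    def on_block_closed(self, tool_use_id: str) -> None:
        self.awaiting_close.discard(tool_use_id)

    def can_start_next_turn(self) -> bool:
        # Safe only when no tool call is open in either stage
        return not self.awaiting_result and not self.awaiting_close
```

Checking can_start_next_turn() before forwarding new user input keeps a fresh turn from arriving while the stream still looks like it is mid-tool-use.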
2. Ensure tool results are formatted correctly in stream context
A common cause of re-invocation is malformed tool result blocks in the bidirectional stream. The model doesn't recognise them as valid results and re-asks.
Make sure your toolResult block includes toolUseId matching exactly what came in the toolUse event:
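    import json

    def build_tool_result_event(tool_use_id, result):
        """Wrap a tool's return value in a toolResult block."""
        return {
            "role": "user",
            "content": [
                {
                    "toolResult": {
                        "toolUseId": tool_use_id,  # must match the toolUse event exactly
                        "content": [
                            {
                                "text": json.dumps(result)  # always stringify
                            }
                        ],
                        "status": "success"  # explicit status matters
                    }
                }
            ]
        }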
In our testing, a missing status or a mismatched toolUseId silently caused re-invocation.
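Because these failures are silent, it can help to check each result before it goes on the wire. The sketch below assumes the toolResult shape shown above; validate_tool_result is a hypothetical helper, and the accepted status values ("success", "error") are an assumption.

```python
def validate_tool_result(event: dict, pending_ids: set) -> list:
    """Return a list of problems found in a tool result event (empty if none)."""
    problems = []
    blocks = [b["toolResult"] for b in event.get("content", []) if "toolResult" in b]
    if not blocks:
        problems.append("no toolResult block found")
    for tr in blocks:
        tool_use_id = tr.get("toolUseId")
        if tool_use_id not in pending_ids:
            problems.append(f"toolUseId {tool_use_id!r} does not match a pending call")
        if tr.get("status") not in ("success", "error"):
            problems.append("status is missing or not one of success/error")
        for item in tr.get("content", []):
            if not isinstance(item.get("text"), str):
                problems.append("content text is not a string")
    return problems
```

Logging the returned problems instead of sending the event makes a re-invocation loop much easier to trace back to its cause.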
3. Hallucinated XML in textOutput — suppress before TTS
This is a Nova Sonic 2 specific quirk. Filter textOutput chunks before passing to TTS:
    import re

    TOOL_XML_PATTERN = re.compile(
        r'<tools>.*?</tools>|<function=.*?</function>|<tool_call>.*?</tool_call>',
        re.DOTALL | re.IGNORECASE,
    )

    def filter_tts_text(text_chunk: str) -> str:
        """Remove hallucinated tool XML that is fully contained in one chunk."""
        # Always apply the pattern so that <tool_call> blocks are caught too
        cleaned = TOOL_XML_PATTERN.sub('', text_chunk)
        return cleaned.strip() if cleaned != text_chunk else text_chunk

    # In your stream handler (send_to_tts is your own TTS hook)
    def handle_event(event_type, event, send_to_tts):
        if event_type == 'textOutput':
            clean_text = filter_tts_text(event['text'])
            if clean_text:  # only send non-empty chunks to TTS
                send_to_tts(clean_text)
Also buffer chunks until you see a sentence boundary — partial XML sometimes spans two chunks.
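One way to do that buffering is sketched below. TtsTextBuffer is a hypothetical helper: it holds back any text after an unclosed tool tag or a trailing `<` fragment, and releases text only up to the last sentence boundary. The sentence-boundary rule (., ! or ? followed by whitespace or end of buffer) is a simplifying assumption.

```python
import re

TOOL_XML = re.compile(
    r'<tools>.*?</tools>|<function=.*?</function>|<tool_call>.*?</tool_call>',
    re.DOTALL | re.IGNORECASE,
)
OPEN_TAG = re.compile(r'<(?:tools>|function=|tool_call>)', re.IGNORECASE)
SENTENCE_END = re.compile(r'[.!?](?:\s|$)')


class TtsTextBuffer:
    """Buffers textOutput chunks and releases only complete, XML-free sentences."""

    def __init__(self):
        self._buf = ""

    def feed(self, chunk: str) -> str:
        # Strip any tool XML that is now complete in the buffer
        self._buf = TOOL_XML.sub("", self._buf + chunk)
        opened = OPEN_TAG.search(self._buf)
        if opened:
            safe_end = opened.start()  # hold everything from the unclosed tag on
        else:
            lt = self._buf.rfind("<")
            # A trailing '<' with no '>' may be the start of a split tag
            safe_end = lt if lt != -1 and ">" not in self._buf[lt:] else len(self._buf)
        last = None
        for match in SENTENCE_END.finditer(self._buf[:safe_end]):
            last = match
        if last is None:
            return ""  # no complete sentence yet
        ready, self._buf = self._buf[:last.end()], self._buf[last.end():]
        return ready.strip()

    def flush(self) -> str:
        """Call at end of turn: return what is left, minus any tool XML."""
        text = TOOL_XML.sub("", self._buf)
        opened = OPEN_TAG.search(text)
        if opened:
            text = text[:opened.start()]  # drop an XML block that never closed
        self._buf = ""
        return text.strip()
```

Pass each non-empty return value of feed() to TTS, and call flush() when the model's text content ends. The trade-off is a little extra latency, and legitimate text containing a bare `<` is held until the next `>` or flush.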
4. Add a circuit breaker for tool re-invocation
Even with the above fixes, add a hard circuit breaker:
    MAX_TOOL_ROUNDS = 3  # the model should never need more than this

    class ConversationManager:
        def __init__(self):
            self.tool_round_count = 0

        def on_tool_use_event(self, tool_use_id, tool_name, tool_input):
            self.tool_round_count += 1
            if self.tool_round_count > MAX_TOOL_ROUNDS:
                # Force the turn to end by returning a synthetic stop result
                return {
                    "toolUseId": tool_use_id,
                    "content": [{"text": "Maximum tool calls reached. Please respond with information gathered so far."}],
                    "status": "success"
                }
            return self.execute_tool(tool_name, tool_input)

        def execute_tool(self, tool_name, tool_input):
            # Placeholder: dispatch to your own tool implementations here
            raise NotImplementedError

        def reset_turn(self):
            self.tool_round_count = 0
5. Consider toolChoice: any or a specific tool
With toolChoice: auto and multiple tools, Nova Sonic 2 can get into an indeterminate state. For turns where you know the user's intent needs specific tools, switch to toolChoice: any or pass the specific tool name. Reserve auto for open-ended turns.
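A minimal sketch of that per-turn switch follows. The helper names are hypothetical, and the returned shapes ({"auto": {}}, {"any": {}}, {"tool": {"name": ...}}) are an assumption based on the Bedrock tool configuration format; confirm them against the Nova tool configuration documentation before use.

```python
from typing import Optional, Sequence


def build_tool_choice(mode: str, tool_name: Optional[str] = None) -> dict:
    """Build a toolChoice value (shape assumed, see lead-in)."""
    if mode in ("auto", "any"):
        return {mode: {}}
    if mode == "tool":
        if not tool_name:
            raise ValueError("mode 'tool' requires tool_name")
        return {"tool": {"name": tool_name}}
    raise ValueError(f"unknown toolChoice mode: {mode!r}")


def choose_for_turn(expected_tools: Sequence[str]) -> dict:
    """Pick a toolChoice from the tools you already know this turn needs."""
    if not expected_tools:
        return build_tool_choice("auto")  # open-ended turn
    if len(expected_tools) == 1:
        return build_tool_choice("tool", expected_tools[0])
    return build_tool_choice("any")  # several candidates: force at least one call
```

Here expected_tools would come from your own intent detection; with an empty list the helper falls back to auto.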
Summary of fixes:
- Explicit state machine tracking tool call lifecycle per turn
- Deduplicate tool calls by tool_name + input signature
- Always include status: success in toolResult blocks
- Regex filter textOutput chunks before TTS
- Hard circuit breaker at 3 tool rounds
- Prefer toolChoice: any over auto for multi-tool scenarios
Hope this unblocks you — bidirectional streaming with multi-tool is genuinely tricky to get right.
