Amazon Nova Sonic 2 — Infinite tool calling loop with multiple tools

We're using Amazon Nova Sonic 2 with bidirectional streaming (InvokeModelWithBidirectionalStream) and multiple tools configured with toolChoice: auto and temperature: 0. When a user query triggers a single tool call, everything works perfectly: the model calls the tool, receives the result, speaks the response, and ends the turn.

However, when the model decides to call two or more tools sequentially for a single user query, it often enters an infinite loop after receiving the results from both tools. Instead of generating a spoken response, it immediately calls the same tools again, receives results, calls them again, and so on indefinitely. The loop persists even when we return synthetic "stop" results or empty results; the model keeps issuing new toolUse events.

We've also observed that the model sometimes hallucinates tool call XML directly in its textOutput (e.g., <tools><function=tool_name><parameter=query>...</function>) instead of using the proper toolUse event protocol, and the TTS audio for this hallucinated XML plays to the user before it can be suppressed.

Has anyone else encountered this behavior with multi-tool configurations on Nova Sonic 2, and are there any recommended patterns for preventing the model from re-invoking tools after it has already received valid results?

2 Answers

The behavior you're experiencing with multiple tool calls in Amazon Nova Sonic 2 appears to be related to how the model handles sequential tool invocations and the event-driven architecture of the bidirectional streaming API.

Based on the documented architecture, Amazon Nova Sonic uses an event-based protocol where the client and model exchange structured JSON events to control session lifecycle, audio streaming, text responses, and tool interactions. When using the "auto" tool choice parameter, the model has maximum flexibility to decide whether tools are needed and can call multiple tools if required.

To address the infinite loop issue, consider these approaches:

Tool Choice Configuration: Instead of using "auto" for scenarios where you expect multiple tool calls, you might experiment with the "any" parameter, which ensures at least one tool is called at the beginning of the response with the model selecting the most appropriate one. This may provide more predictable behavior for multi-tool scenarios.

Event Flow Management: Since Amazon Nova Sonic implements asynchronous tool calling that allows the AI to continue conversing naturally while tools run in the background, ensure your application properly manages the complete event lifecycle. After receiving a toolUse event, return the matching toolResult and close that content block with contentEnd so the model can tell the result is complete. Send promptEnd and sessionEnd only when you actually intend to end the prompt or the session, not after every tool result.

Tool Result Structure: When returning tool results, ensure they're formatted as proper toolResult events with clear, conclusive information that signals to the model that the tool execution is complete and no further calls are needed.
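
As a rough illustration of the event flow described above, here is a minimal Python sketch that builds the three input events for one tool result. The field names (toolResultInputConfiguration, type/role "TOOL", the string-valued content) reflect my reading of the Nova Sonic input event documentation and should be treated as assumptions to verify against the current docs for Nova Sonic 2. The helper name tool_result_events is hypothetical.

```python
import json
import uuid


def tool_result_events(prompt_name: str, tool_use_id: str, result: dict) -> list:
    """Build a contentStart / toolResult / contentEnd triple for one tool result.

    Field names are assumptions based on the published Nova Sonic event
    schema; check them against the documentation for your model version.
    """
    content_name = str(uuid.uuid4())  # one content block per tool result
    return [
        {"event": {"contentStart": {
            "promptName": prompt_name,
            "contentName": content_name,
            "interactive": False,
            "type": "TOOL",
            "role": "TOOL",
            "toolResultInputConfiguration": {
                "toolUseId": tool_use_id,  # echo the id from the toolUse event
                "type": "TEXT",
                "textInputConfiguration": {"mediaType": "text/plain"},
            },
        }}},
        {"event": {"toolResult": {
            "promptName": prompt_name,
            "contentName": content_name,
            "content": json.dumps(result),  # result serialized as a string
        }}},
        {"event": {"contentEnd": {
            "promptName": prompt_name,
            "contentName": content_name,
        }}},
    ]
```

Sending the three events back to back for each toolUseId, before any further input, keeps every tool result in its own clearly closed content block.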

The hallucinated XML in textOutput suggests the model may be confused about the proper protocol for tool invocation. This reinforces the importance of clear tool descriptions and proper event handling. Make sure your tool definitions use descriptive, action-oriented names and clear descriptions that help the model understand when and how to use each tool appropriately.

If the issue persists with temperature set to 0 and proper event handling, this may represent a limitation in how the model currently handles complex multi-tool scenarios that would benefit from AWS support engagement.
Sources
Tool configuration - Amazon Nova
Make your web apps hands-free with Amazon Nova Sonic | Artificial Intelligence
Using the Amazon Nova Sonic Speech-to-Speech model - Amazon Nova

answered 24 days ago

This is a known pain point with Nova Sonic 2 multi-tool bidirectional streaming. I've seen both issues you described. Here's what worked for us:

1. The infinite loop — root cause and fix

The loop usually happens because the conversation state isn't cleanly closing the tool-use cycle before the next turn begins. The model sees "tool results received" but the stream context still looks like it's mid-tool-use, so it re-invokes.

Fix: Implement an explicit tool call state machine

Track tool state explicitly on your side:

import json

class ToolCallState:
    """Tracks tool calls within one turn so duplicates can be rejected."""

    def __init__(self):
        self.pending_calls = {}      # toolUseId -> tool_name
        self.completed_calls = set() # toolUseIds already returned results
        self.turn_tool_calls = []    # all tool calls in this turn

    def register_call(self, tool_use_id, tool_name, input_data):
        # Deduplicate: if the same tool+input was already called this turn, skip
        call_signature = f"{tool_name}:{json.dumps(input_data, sort_keys=True)}"
        if any(c['sig'] == call_signature for c in self.turn_tool_calls):
            return False  # already called this
        self.pending_calls[tool_use_id] = tool_name
        self.turn_tool_calls.append({'id': tool_use_id, 'sig': call_signature})
        return True

    def mark_completed(self, tool_use_id):
        self.completed_calls.add(tool_use_id)
        self.pending_calls.pop(tool_use_id, None)

    def all_results_returned(self):
        return len(self.pending_calls) == 0

    def reset_turn(self):
        # Call once the model has finished speaking its response
        self.pending_calls = {}
        self.completed_calls = set()
        self.turn_tool_calls = []

Only start the next user turn after all_results_returned() is True and you've confirmed the model has moved on to response generation: every tool result content block has been closed (contentEnd in the Nova Sonic event protocol) and the model has begun emitting textOutput or audioOutput.
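
The gating rule above could be sketched as a small helper that sits next to the state machine. This is a minimal sketch, not part of any SDK: TurnGate and its method names are hypothetical, and you would call them from wherever your stream handler sees toolUse events, sends results, and receives model output.

```python
class TurnGate:
    """Decides when it is safe to start the next user turn (hypothetical helper)."""

    def __init__(self):
        self.pending_tool_ids = set()  # toolUse events still awaiting a result
        self.model_responding = False  # True once output arrives with nothing pending

    def on_tool_use(self, tool_use_id: str) -> None:
        self.pending_tool_ids.add(tool_use_id)
        self.model_responding = False

    def on_tool_result_sent(self, tool_use_id: str) -> None:
        self.pending_tool_ids.discard(tool_use_id)

    def on_model_output(self) -> None:
        # Text or audio arriving while no results are outstanding means the
        # model has moved on to answering.
        if not self.pending_tool_ids:
            self.model_responding = True

    def can_start_next_turn(self) -> bool:
        return not self.pending_tool_ids and self.model_responding
```

Output that arrives while a result is still outstanding does not open the gate, which is the case that matters when the model calls two tools for one query.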

2. Ensure tool results are formatted correctly in stream context

A common cause of re-invocation is malformed tool result blocks in the bidirectional stream. The model doesn't recognise them as valid results and re-asks.

Make sure your toolResult block includes toolUseId matching exactly what came in the toolUse event:

tool_result_event = {
    "role": "user",
    "content": [
        {
            "toolResult": {
                "toolUseId": tool_use_id,  # must match exactly
                "content": [
                    {
                        "text": json.dumps(result)  # always stringify
                    }
                ],
                "status": "success"  # explicit status matters
            }
        }
    ]
}

In our testing, a missing status or a mismatched toolUseId silently caused re-invocation.
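
Because these failures are silent, it may help to check each block before sending it. The following is a minimal sketch that assumes the toolResult block shape shown in this answer; validate_tool_result is a hypothetical helper, not an SDK function.

```python
def validate_tool_result(block: dict, pending_ids: set) -> list:
    """Return a list of problems; an empty list means the block looks sendable.

    Assumes the toolResult shape used in this answer (toolUseId, content, status).
    """
    result = block.get("toolResult")
    if not isinstance(result, dict):
        return ["missing toolResult object"]
    problems = []
    tool_use_id = result.get("toolUseId")
    if tool_use_id not in pending_ids:
        problems.append(f"toolUseId {tool_use_id!r} does not match a pending toolUse")
    if not result.get("status"):
        problems.append("status is missing")
    content = result.get("content")
    if not content or not all(
        isinstance(c, dict) and isinstance(c.get("text"), str) for c in content
    ):
        problems.append("content must be a non-empty list of {'text': str} items")
    return problems
```

Logging the returned problems, instead of sending the block anyway, turns a silent re-invocation into a visible error on your side.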

3. Hallucinated XML in textOutput — suppress before TTS

This is a quirk specific to Nova Sonic 2. Filter textOutput chunks before passing them to TTS:

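import re

# Complete tool-call blocks that the model sometimes emits as plain text
TOOL_XML_PATTERN = re.compile(
    r'<tools>.*?</tools>|<function=.*?</function>|<tool_call>.*?</tool_call>',
    re.DOTALL | re.IGNORECASE
)
# Stray wrapper tags left behind when a block is opened but never closed
STRAY_TAG_PATTERN = re.compile(r'</?(?:tools|tool_call)>', re.IGNORECASE)

def filter_tts_text(text_chunk: str) -> str:
    """Remove hallucinated tool XML that is complete within this chunk."""
    # XML split across two chunks is NOT caught here; that needs buffering
    cleaned = TOOL_XML_PATTERN.sub('', text_chunk)
    cleaned = STRAY_TAG_PATTERN.sub('', cleaned)
    # Leave untouched chunks exactly as they arrived (spacing included)
    return cleaned.strip() if cleaned != text_chunk else text_chunk

# In your stream handler; send_to_tts is your own TTS hook
def on_text_output(event: dict, send_to_tts) -> None:
    clean_text = filter_tts_text(event['text'])
    if clean_text:  # only send non-empty chunks to TTS
        send_to_tts(clean_text)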

Also buffer chunks until you see a sentence boundary — partial XML sometimes spans two chunks.
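
One way to do that buffering is sketched below. It is a minimal sketch under stated assumptions: TtsTextBuffer is a hypothetical helper, the tag names are the ones reported in the question, and any text after an unclosed "<" is held back until the tag closes or the turn ends.

```python
import re

# Assumed tag names, taken from the XML reported in the question.
_TOOL_XML = re.compile(
    r"<tools>.*?</tools>|<function=.*?</function>|<tool_call>.*?</tool_call>",
    re.DOTALL | re.IGNORECASE,
)
# A sentence ends at . ! or ? followed by whitespace or end of text.
_SENTENCE_END = re.compile(r"[.!?](?:\s|$)")


class TtsTextBuffer:
    """Accumulates text chunks and releases only complete, XML-free sentences."""

    def __init__(self):
        self._buf = ""

    def feed(self, chunk: str) -> str:
        """Add a chunk; return text that is safe to speak (may be empty)."""
        self._buf = _TOOL_XML.sub("", self._buf + chunk)
        # Hold everything from an unclosed '<' onward; it may be partial XML.
        open_tag = self._buf.find("<")
        candidate = self._buf if open_tag == -1 else self._buf[:open_tag]
        # Release up to the last sentence boundary in the safe region.
        last_end = None
        for match in _SENTENCE_END.finditer(candidate):
            last_end = match.end()
        if last_end is None:
            return ""
        ready, self._buf = self._buf[:last_end], self._buf[last_end:]
        return ready.strip()

    def flush(self) -> str:
        """Call at end of turn: return leftover text, dropping any unclosed tag."""
        rest, self._buf = self._buf, ""
        open_tag = rest.find("<")
        return (rest if open_tag == -1 else rest[:open_tag]).strip()
```

The trade-off is a little added latency, since speech starts only once a full sentence has arrived, and legitimate text containing a literal "<" is held until flush().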

4. Add a circuit breaker for tool re-invocation

Even with the above fixes, add a hard circuit breaker:

MAX_TOOL_ROUNDS = 3  # per-turn budget; the model should never need more than this

class ConversationManager:
    def __init__(self):
        self.tool_round_count = 0

    def execute_tool(self, tool_name, tool_input):
        # Placeholder: dispatch to your real tool implementations here
        raise NotImplementedError

    def on_tool_use_event(self, tool_use_id, tool_name, tool_input):
        # Incremented once per toolUse event, so two tools called in the
        # same round consume two units of the budget
        self.tool_round_count += 1

        if self.tool_round_count > MAX_TOOL_ROUNDS:
            # Force end the turn by returning a synthetic stop result
            return {
                "toolUseId": tool_use_id,
                "content": [{"text": "Maximum tool calls reached. Please respond with information gathered so far."}],
                "status": "success"
            }

        return self.execute_tool(tool_name, tool_input)

    def reset_turn(self):
        self.tool_round_count = 0
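
Note that the breaker above increments on every toolUse event, so a query that legitimately needs two tools uses up the budget twice as fast. If you would rather count rounds, where a round is a burst of toolUse events followed by their results, a minimal sketch could look like this. ToolRoundCounter and its method names are hypothetical.

```python
class ToolRoundCounter:
    """Counts tool rounds: a burst of toolUse events before results go back."""

    def __init__(self, max_rounds: int = 3):
        self.max_rounds = max_rounds
        self.rounds = 0
        self._in_round = False

    def on_tool_use(self) -> bool:
        """Record a toolUse event; return False once the round budget is exceeded."""
        if not self._in_round:
            # First toolUse since the last batch of results: a new round starts.
            self._in_round = True
            self.rounds += 1
        return self.rounds <= self.max_rounds

    def on_results_sent(self) -> None:
        """Call after returning all results for the current burst."""
        self._in_round = False

    def reset_turn(self) -> None:
        self.rounds = 0
        self._in_round = False
```

When on_tool_use() returns False, apply whatever cut-off you use for the breaker, then call reset_turn() once the turn is over.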

5. Consider toolChoice: any or a specific tool instead of auto

With toolChoice: auto and multiple tools, Nova Sonic 2 can get into an indeterminate state. For turns where you know the user's intent needs specific tools, switch to toolChoice: any or pass the specific tool name. Reserve auto for open-ended turns.
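
A minimal sketch of switching the setting per turn follows. The three shapes ({"auto": {}}, {"any": {}}, {"tool": {"name": ...}}) follow the Bedrock toolChoice convention; where exactly the value goes in your promptStart tool configuration is an assumption to check against the Nova Sonic docs, and pick_tool_choice is a hypothetical helper.

```python
from typing import List, Optional


def build_tool_choice(mode: str, tool_name: Optional[str] = None) -> dict:
    """Return a toolChoice value in the Bedrock-style shape (assumed for Nova Sonic)."""
    if mode == "auto":
        return {"auto": {}}
    if mode == "any":
        return {"any": {}}
    if mode == "tool":
        if not tool_name:
            raise ValueError("mode 'tool' requires tool_name")
        return {"tool": {"name": tool_name}}
    raise ValueError(f"unknown toolChoice mode: {mode!r}")


def pick_tool_choice(expected_tools: List[str]) -> dict:
    """Choose a toolChoice from the tools you already know the turn needs."""
    if not expected_tools:
        return build_tool_choice("auto")  # open-ended turn
    if len(expected_tools) == 1:
        return build_tool_choice("tool", expected_tools[0])
    return build_tool_choice("any")  # several candidates: force at least one call
```

For example, pick_tool_choice(["get_weather"]) pins the turn to a single tool, while an empty list falls back to auto.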


Summary of fixes:

  • Explicit state machine tracking tool call lifecycle per turn
  • Deduplicate tool calls by tool_name + input signature
  • Always include status: success in toolResult blocks
  • Regex filter textOutput chunks before TTS
  • Hard circuit breaker at 3 tool rounds
  • Prefer toolChoice: any over auto for multi-tool scenarios

Hope this unblocks you — bidirectional streaming with multi-tool is genuinely tricky to get right.

answered 15 days ago
