ConverseStream returns garbled/truncated tool input JSON for openai.* models (ConverseCommand works correctly)

Summary

When using the Bedrock ConverseStream API with openai.gpt-oss-120b-1:0, the streamed tool input deltas are truncated and produce invalid JSON. The same request using Converse (non-streaming) returns the correct tool input. Other model families (e.g. zai.glm-5) work correctly with both APIs.

Environment

  • Region: us-east-1
  • Model: openai.gpt-oss-120b-1:0
  • Comparison model: zai.glm-5 (works correctly)
  • SDK: @aws-sdk/client-bedrock-runtime 3.821.0 (also tested with latest)
  • Runtime: Node.js 22.x

Reproduction

Minimal self-contained script — save as repro.mjs and run with node repro.mjs:

import {
  BedrockRuntimeClient,
  ConverseCommand,
  ConverseStreamCommand,
} from "@aws-sdk/client-bedrock-runtime";

const client = new BedrockRuntimeClient({ region: "us-east-1" });

const payload = {
  modelId: "openai.gpt-oss-120b-1:0",
  messages: [
    { role: "user", content: [{ text: "What is the weather in New York?" }] },
  ],
  toolConfig: {
    tools: [
      {
        toolSpec: {
          name: "getWeather",
          description: "Get current weather for a given city",
          inputSchema: {
            json: {
              type: "object",
              properties: {
                city: { type: "string", description: "City name" },
              },
              required: ["city"],
            },
          },
        },
      },
    ],
    toolChoice: { any: {} },
  },
};

// --- Test 1: Converse (non-streaming) ---
console.log("=== Converse (non-streaming) ===");
const converseRes = await client.send(new ConverseCommand(payload));
const toolUse = converseRes.output?.message?.content?.find((b) => b.toolUse);
console.log("stopReason:", converseRes.stopReason);
console.log("toolUse.input:", JSON.stringify(toolUse?.toolUse?.input));

// --- Test 2: ConverseStream (streaming) ---
console.log("\n=== ConverseStream (streaming) ===");
for (let run = 1; run <= 3; run++) {
  const streamRes = await client.send(new ConverseStreamCommand(payload));
  let stopReason = null;
  let toolName = null;
  const toolDeltas = [];

  for await (const event of streamRes.stream) {
    if (event.contentBlockStart?.start?.toolUse) {
      toolName = event.contentBlockStart.start.toolUse.name;
    }
    if (event.contentBlockDelta?.delta?.toolUse?.input) {
      toolDeltas.push(event.contentBlockDelta.delta.toolUse.input);
    }
    if (event.messageStop) {
      stopReason = event.messageStop.stopReason;
    }
  }

  const accumulated = toolDeltas.join("");
  let parseStatus;
  try {
    JSON.parse(accumulated);
    parseStatus = "valid JSON";
  } catch {
    parseStatus = "INVALID JSON";
  }

  console.log(
    `Run ${run}: stopReason=${stopReason}, tool=${toolName}, ` +
    `deltas=${JSON.stringify(toolDeltas)}, accumulated="${accumulated}" [${parseStatus}]`
  );
}

Observed Output

=== Converse (non-streaming) ===
stopReason: tool_use
toolUse.input: {"city":"New York"}

=== ConverseStream (streaming) ===
Run 1: stopReason=tool_use, tool=getWeather, deltas=["{\n","\":","\""], accumulated="{
":"
" [INVALID JSON]
Run 2: stopReason=tool_use, tool=getWeather, deltas=["{\n","\":","\""], accumulated="{
":"
" [INVALID JSON]
Run 3: stopReason=tool_use, tool=getWeather, deltas=["{\n "," \"","New"], accumulated="{
  "New" [INVALID JSON]

Expected Output

ConverseStream should return tool input deltas that, when concatenated, produce valid JSON identical to the Converse response: {"city":"New York"}.

For example, valid deltas would be: ["{\n \"city", "\": \"New York\"", "\n}"].
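
As a quick check of that expectation, the sketch below joins the example deltas above and the fragments observed in Run 1, then tries to parse each. `checkDeltas` is a hypothetical helper name used only for illustration; it is not part of any SDK.

```javascript
// Hypothetical helper (not part of any SDK): join streamed tool-input
// fragments and report whether the result parses as JSON.
function checkDeltas(deltas) {
  const accumulated = deltas.join("");
  try {
    return { ok: true, value: JSON.parse(accumulated) };
  } catch (err) {
    return { ok: false, accumulated };
  }
}

// Fragments of the shape expected from ConverseStream.
const expected = checkDeltas(['{\n "city', '": "New York"', "\n}"]);
// Fragments as observed in Run 1 above.
const observed = checkDeltas(["{\n", '":', '"']);

console.log(expected.ok, expected.value);
console.log(observed.ok, JSON.stringify(observed.accumulated));
```

The first set parses to an object with city "New York"; the second cannot be parsed because the key and the value are missing.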

Key Observations

  1. Converse works, ConverseStream does not — same payload, same model, same region. The non-streaming API returns {"city":"New York"} reliably. The streaming API returns truncated/garbled fragments.

  2. Model-specific — the same test with zai.glm-5 produces correct, parseable JSON in both Converse and ConverseStream. This rules out any client-side issue.

  3. Deterministic — the streaming corruption is consistent across runs (tested 3+ times per session, across multiple days).

  4. stopReason is correct — ConverseStream correctly returns stopReason: tool_use and emits contentBlockStart with the tool name. Only the contentBlockDelta tool input fragments are corrupted.

  5. No SDK code changes in the version window — we checked the @aws-sdk/client-bedrock-runtime changelog for April 2–16, 2026: all releases were "version bump only" with no code changes, confirming this is a server-side issue.

Cross-Model Comparison

| API            | openai.gpt-oss-120b-1:0   | zai.glm-5           |
|----------------|---------------------------|---------------------|
| Converse       | {"city":"New York"}       | {"city":"New York"} |
| ConverseStream | ❌ garbled / invalid JSON | {"city":"New York"} |

Reproduction with Vercel AI SDK (@ai-sdk/amazon-bedrock)

The same bug manifests through the Vercel AI SDK, which uses Converse/ConverseStream under the hood. Save as repro-vercel.mjs:

import { createAmazonBedrock } from "@ai-sdk/amazon-bedrock";
import { generateText, streamText, tool } from "ai";
import { z } from "zod";

const bedrock = createAmazonBedrock({ region: "us-east-1" });

const models = ["openai.gpt-oss-120b-1:0", "zai.glm-5"];

const tools = {
  getWeather: tool({
    description: "Get current weather for a given city",
    inputSchema: z.object({ city: z.string().describe("City name") }),
  }),
};

const messages = [{ role: "user", content: "What is the weather in New York?" }];

for (const modelId of models) {
  console.log(`\n=== ${modelId} ===`);

  // generateText → uses Converse
  const gen = await generateText({
    model: bedrock(modelId), messages, tools, toolChoice: "required", maxSteps: 1,
  });
  console.log(`generateText: ${gen.toolCalls[0]?.toolName}(${JSON.stringify(gen.toolCalls[0]?.input)})`);

  // streamText → uses ConverseStream
  const stream = streamText({
    model: bedrock(modelId), messages, tools, toolChoice: "required", maxSteps: 1,
  });
  const parts = [];
  for await (const part of stream.fullStream) parts.push(part);
  const tc = parts.find((p) => p.type === "tool-call");
  console.log(`streamText:   ${tc?.toolName}(${JSON.stringify(tc?.input)})`);
}

Observed output:

=== openai.gpt-oss-120b-1:0 ===
generateText: getWeather({"city":"New York"})
streamText:   getWeather("{\n  \"New")

=== zai.glm-5 ===
generateText: getWeather({"city":"New York"})
streamText:   getWeather({"city":"New York"})

Versions tested: ai 6.0.86, @ai-sdk/amazon-bedrock 4.0.93, zod 4.3.6.

This confirms the Vercel AI SDK faithfully receives the garbled deltas from ConverseStream and reassembles them as-is — the corruption originates server-side.

Impact

This bug makes it impossible to use ConverseStream with openai.* models when tools are involved. Any framework that uses ConverseStream for agentic streaming (e.g. Vercel AI SDK, Mastra, LangChain) will receive invalid tool arguments, causing tool calls to fail silently or with empty {} args.

The workaround we've implemented is to simulate streaming by using the Converse (non-streaming) API and wrapping the response in a synthetic stream. This works but defeats the purpose of streaming (no incremental token delivery).
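
A minimal sketch of that kind of wrapper is shown here. It is not our production code: `syntheticStreamEvents` is a hypothetical name, only text and toolUse blocks are handled, and the event shapes are assumed to mirror the ones the repro script reads from ConverseStream.

```javascript
// Sketch: turn a Converse (non-streaming) response into a sequence of
// ConverseStream-shaped events. Only text and toolUse blocks are handled.
function* syntheticStreamEvents(converseResponse) {
  const message = converseResponse.output?.message ?? { role: "assistant", content: [] };
  yield { messageStart: { role: message.role } };

  const blocks = message.content ?? [];
  for (let i = 0; i < blocks.length; i++) {
    const block = blocks[i];
    if (block.toolUse) {
      const { toolUseId, name, input } = block.toolUse;
      yield { contentBlockStart: { contentBlockIndex: i, start: { toolUse: { toolUseId, name } } } };
      // The whole tool input goes out as a single delta, serialized to a
      // string like a streamed tool input fragment.
      yield { contentBlockDelta: { contentBlockIndex: i, delta: { toolUse: { input: JSON.stringify(input ?? {}) } } } };
    } else if (typeof block.text === "string") {
      yield { contentBlockDelta: { contentBlockIndex: i, delta: { text: block.text } } };
    } else {
      continue; // other block types are out of scope for this sketch
    }
    yield { contentBlockStop: { contentBlockIndex: i } };
  }

  yield { messageStop: { stopReason: converseResponse.stopReason } };
  yield { metadata: { usage: converseResponse.usage, metrics: converseResponse.metrics } };
}

// Usage with a hand-written response shaped like the Converse output above.
const fakeResponse = {
  output: { message: { role: "assistant", content: [
    { toolUse: { toolUseId: "tool-1", name: "getWeather", input: { city: "New York" } } },
  ] } },
  stopReason: "tool_use",
};
const events = [...syntheticStreamEvents(fakeResponse)];
console.log(events.map((e) => Object.keys(e)[0]).join(" -> "));
```

Because `for await` also accepts synchronous iterables, a consumer loop written for `streamRes.stream` can iterate over this generator without changes.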

Tags

Amazon Bedrock, ConverseStream, tool_use, openai.gpt-oss-120b-1:0, streaming, tool calling

Timeline

The same application code with the same model (openai.gpt-oss-120b-1:0) was working correctly with ConverseStream tool calls until Monday, April 13, 2026. The issue appeared without any changes to the client code or SDK versions.

3 Answers

Thank you for the detailed bug report. Based on the information provided and the Amazon Bedrock documentation, I can confirm that your implementation appears correct and follows the expected patterns for using the ConverseStream API.

According to the ConverseStream specification, when streaming responses that include tool use, the API should emit events in this order:

  1. messageStart - Contains the role
  2. contentBlockStart - Marks the start of a content block (including tool use)
  3. contentBlockDelta - Contains partial content, including the toolUse field with partial input JSON
  4. contentBlockStop - Marks the end of the content block
  5. messageStop - Contains the stop reason
  6. metadata - Contains usage and metrics

The contentBlockDelta events should contain partial JSON in the toolUse.input field that, when concatenated, forms valid JSON. Your observation that the deltas produce garbled or truncated JSON (like ["{\n","\":","\""] instead of valid fragments like ["{\n \"city", "\": \"New York\"", "\n}"]) indicates that the service is not formatting the streaming response properly.
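
To make the concatenation step concrete, here is a minimal sketch of a consumer that collects toolUse input fragments per content block and parses them when contentBlockStop arrives. `collectToolCalls` and `wrap` are hypothetical names, and the function takes an already-collected array of events; with a live stream the same loop body would run inside `for await`.

```javascript
// Sketch: collect toolUse input fragments per content block and parse each
// block once its contentBlockStop event arrives.
function collectToolCalls(events) {
  const pending = new Map(); // contentBlockIndex -> { toolUseId, name, fragments }
  const results = [];

  for (const event of events) {
    const start = event.contentBlockStart;
    if (start?.start?.toolUse) {
      pending.set(start.contentBlockIndex, { ...start.start.toolUse, fragments: [] });
    }

    const delta = event.contentBlockDelta;
    if (typeof delta?.delta?.toolUse?.input === "string") {
      pending.get(delta.contentBlockIndex)?.fragments.push(delta.delta.toolUse.input);
    }

    const stop = event.contentBlockStop;
    if (stop && pending.has(stop.contentBlockIndex)) {
      const { toolUseId, name, fragments } = pending.get(stop.contentBlockIndex);
      pending.delete(stop.contentBlockIndex);
      const raw = fragments.join("");
      try {
        results.push({ toolUseId, name, input: JSON.parse(raw), error: null });
      } catch (err) {
        // Surface the raw text instead of silently passing {} to the tool.
        results.push({ toolUseId, name, input: null, error: `invalid tool input JSON: ${JSON.stringify(raw)}` });
      }
    }
  }
  return results;
}

// Hypothetical helper: wrap a list of input fragments in start/delta/stop events.
const wrap = (deltas) => [
  { contentBlockStart: { contentBlockIndex: 0, start: { toolUse: { toolUseId: "t1", name: "getWeather" } } } },
  ...deltas.map((input) => ({ contentBlockDelta: { contentBlockIndex: 0, delta: { toolUse: { input } } } })),
  { contentBlockStop: { contentBlockIndex: 0 } },
];

const good = collectToolCalls(wrap(['{\n "city', '": "New York"', "\n}"]));
const bad = collectToolCalls(wrap(["{\n", '":', '"']));
console.log(good[0].input, bad[0].error);
```

Reporting the parse failure with the raw text makes the corruption visible to the caller, rather than letting a tool run with missing arguments.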

Consider that:

  • The non-streaming Converse API returns correct JSON ({"city":"New York"})
  • The same ConverseStream code works correctly with other models (like zai.glm-5)
  • The stopReason is correctly set to tool_use
  • The issue is consistent and reproducible

This strongly suggests a service-side issue specific to the openai.gpt-oss-120b-1:0 model's streaming implementation, rather than a client-side problem.

Since this appears to be a model-specific service issue that emerged recently (around April 13, 2026) without any client code changes, I recommend:

  1. Opening a support case with AWS to report this as a potential service regression
  2. Continuing to use your workaround with the non-streaming Converse API for this specific model
  3. Monitoring for service updates or announcements regarding the openai.* model family

Your reproduction scripts are excellent and should help AWS engineering teams identify and resolve the issue quickly.
Sources
ConverseStream - Amazon Bedrock
Using the Converse API - Amazon Bedrock

answered a month ago
EXPERT
reviewed a month ago

I also faced this issue with converse_stream for toolUse arguments:

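import boto3
from botocore.config import Config

# AWS_REGION, AWS_ACCESS_KEY, and AWS_SECRET_KEY are assumed to be defined elsewhere (not shown in the original post).
MODEL_ID = "openai.gpt-oss-120b-1:0"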
# MODEL_ID = "us.anthropic.claude-haiku-4-5-20251001-v1:0"

bedrock_config = Config(
    region_name=AWS_REGION,
    max_pool_connections=100,
    retries={"max_attempts": 3, "mode": "standard"},
    connect_timeout=3,
    read_timeout=60,
    tcp_keepalive=True
)

client = boto3.client(
    service_name='bedrock-runtime',
    region_name=AWS_REGION,
    config=bedrock_config,
    aws_access_key_id=AWS_ACCESS_KEY,
    aws_secret_access_key=AWS_SECRET_KEY
)


class AwsBedRockLLM:

    @classmethod
    def make_stream(
            cls,
            messages,
            model,
            system_message,
            tool_list=None,
            instructions=None,
            agent_id=None,
            call_id=None,
    ):
        system_prompt = instructions if instructions else system_message
        payload = {
            "modelId": model,
            "system": [{"text": system_prompt}] if system_prompt else [],
            "messages": cls.convert_messages_for_bedrock(messages),
            "inferenceConfig": {
                "temperature": 0.5,
                "maxTokens": 800,
                # "topP": 0.9
            },
            # "performanceConfig": {"latency": "standard"}
        }
        # Documentation: https://docs.aws.amazon.com/bedrock/latest/userguide/inference-parameters.html

        if tool_list:
            payload["toolConfig"] = {
                "tools": tool_list,
                "toolChoice": {
                    "auto": {}
                }
            }

        response = client.converse_stream(**payload)

        return response
answered 21 days ago

Hello.

I tested the following code using Boto3, and it seems to work without any problems.
The issue you encountered is most likely on the AWS side, but it could also be an SDK issue.
Therefore, in addition to contacting AWS Support, please try opening an issue in the following GitHub repository:
https://github.com/aws/aws-sdk-js-v3/issues

import boto3

MODEL_ID = "openai.gpt-oss-120b-1:0"


def main() -> None:
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    system = [{"text": "Please answer the question in English."}]
    messages = [
        {
            "role": "user",
            "content": [{"text": "Which is cuter, dogs or cats?"}],
        }
    ]
    inference_config = {"maxTokens": 1024, "temperature": 1.0, "topP": 1.0}
    additional_config = {"reasoning_effort": "medium"}
    response = client.converse_stream(
        modelId=MODEL_ID,
        system=system,
        messages=messages,
        inferenceConfig=inference_config,
        additionalModelRequestFields=additional_config,
    )

    for chunk in response["stream"]:
        if "contentBlockDelta" in chunk:
            # Print only the text deltas (the final answer); reasoningContent deltas are skipped
            delta = chunk["contentBlockDelta"]["delta"]
            if "text" in delta:
                print(delta["text"], end="", flush=True)


if __name__ == "__main__":
    main()

Incidentally, when you run it, you will see a response like the following.

python bedrock_test.py 

The short answer is: **both can be adorable, and which one you find “cuter” really depends on your personal taste.**  

Below is a quick rundown of the typical traits that make dogs and cats endearing to different people. You can use it to decide which side of the cuteness debate resonates with you (or just enjoy both!).

| Feature | Dogs (Why they’re often called “cute”) | Cats (Why they’re often called “cute”) |
|---------|----------------------------------------|----------------------------------------|
| **Facial Expressions** | Big, expressive eyes; wagging tongues; a “puppy‑dog” look that signals friendliness and a desire to please. | Large, almond‑shaped eyes; slow blinks (the “cat kiss”) that signal trust; subtle ear and whisker movements. |
| **Body Language** | Playful bows, tail wags, goofy clumsiness, and the “play‑dead” flop that looks irresistibly helpless. | Graceful stretches, “kneading” paws, and the classic “loaf” pose that’s both sleek and cuddly. |
| **Vocalizations** | Excited barks, whines, and happy panting that can sound like a heartfelt “hey, I’m happy to see you!” | Soft purrs (often a sign of contentment) and gentle chirps when they’re intrigued. |
| **Variety of Sizes & Breeds** | From tiny Chihuahuas with oversized ears to fluffy Newfoundlands that look like giant stuffed toys. | From petite Singapuras that fit in a palm to majestic Maine Coons that seem like mini‑lions. |
| **Social Interaction** | Generally eager to be near you, follow you around, and seek physical contact (petting, hugging, even “lap‑dog” moments). | Often more independent, but when they choose to curl up on your lap, it feels like a special, selective approval. |
| **Playfulness** | High‑energy fetch, zoomies, and goofy “head‑butts” that make them look hilariously enthusiastic. | Light pouncing, laser‑pointer chases, and “pounce‑and‑run” games that showcase their agility and curiosity. |
| **Cuteness Triggers** | Puppy adoptions, big floppy ears, tongue‑out selfies, and “sad dog” eyes when they want a treat. | Kitten videos of tumbling, tiny paws, “binky” moments (when they hop on hind legs), and the classic “sleep‑sprawl.” |

### What People Usually Pick

- **Dog lovers** often cite loyalty, the “always‑happy‑to‑see‑you” vibe, and the opportunity to be an active companion (walking, hiking, etc.) as reasons they find dogs cuter.
- **Cat lovers** often point to the sleek elegance, the soothing purr, and the occasional bursts of playful mischief that feel like a secret shared between the cat and its human.

### A Fun Way to Decide for Yourself

1. **Watch a short video** of a puppy and a kitten side‑by‑side. Notice which one makes you smile, coo, or feel an instant “aww.”  
2. **Imagine a day in the life** with each: a dog greeting you at the door with a wagging tail vs. a cat greeting you with a slow blink and a gentle head‑butt. Which scenario feels more heart‑warming?  
3. **Consider your lifestyle**: Do you have time for frequent walks and outdoor play (dog‑friendly) or prefer low‑maintenance companionship that enjoys quiet lap‑time (cat‑friendly)?

### Bottom Line

There’s **no objective winner**—the “cuter” animal is the one that captures your heart. Many people love both (think of the classic “dog‑cat duo” in memes!) and find each species adorable in its own way. So whether you’re team **Paw‑fect Pup** or **Purr‑fect Kitty**, the world is richer (and definitely cuter) because we have both.

**If you’re still undecided, try spending a little time with a friendly dog and a mellow cat.** The one that gives you the most warm fuzzies in that moment is probably your personal cuteness champion—at least for today!
EXPERT
answered a month ago
  • Thank you for testing, but your script doesn't include toolConfig; it only tests text streaming, which works fine for us too. The bug is specifically about tool input deltas being garbled.

    Here's a boto3 repro with tool calling:

    import boto3, json
    
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    payload = {
        "modelId": "openai.gpt-oss-120b-1:0",
        "messages": [{"role": "user", "content": [{"text": "What is the weather in New York?"}]}],
        "toolConfig": {
            "tools": [{"toolSpec": {"name": "getWeather", "description": "Get weather",
                "inputSchema": {"json": {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]}}}}],
            "toolChoice": {"any": {}},
        },
    }
    
    res = client.converse(**payload)
    tool = next(b["toolUse"] for b in res["output"]["message"]["content"] if "toolUse" in b)
    print(f"converse: {json.dumps(tool['input'])}")
    
    stream = client.converse_stream(**payload)
    deltas = []
    for e in stream["stream"]:
        if "contentBlockDelta" in e and "toolUse" in e["contentBlockDelta"]["delta"]:
            deltas.append(e["contentBlockDelta"]["delta"]["toolUse"].get("input", ""))
    print(f"converse_stream: {repr(''.join(deltas))}")

    Output:

    • converse: {"city": "New York"}
    • converse_stream: '{\n":"\n'

    Same script with zai.glm-5: both APIs return valid JSON. Confirmed with 3 SDKs (boto3, JS SDK, Vercel AI SDK) — the corruption is server-side and specific to openai models.
