AWS IoT MQTT Test Client: Unicode emoji in message payload gets converted from UTF-8 bytes to \u escape encoding

I performed this test on the console's AWS IoT MQTT Test Client:

  1. I subscribed to the shadow/update/accepted topic for my thing
  2. I published a payload containing the Unicode house emoji 🏡, UTF-8 encoded as F0 9F 8F A1, to the shadow/update topic for my thing
  3. But the payload on the shadow/update/accepted topic contained the emoji as a \u escape sequence of UTF-16 surrogates: "\uD83C\uDFE1"

My question is: why is AWS converting from UTF-8 to UTF-16?

My problem is that I have code that checks that the payload received on the shadow/update/accepted topic is the same as the payload sent on the shadow/update topic, and it reports an error if they differ.

NOTE: when testing this on the MQTT Test Client make sure when subscribing and publishing that the Additional Configuration / MQTT payload display option is set to "Display payloads as strings (more accurate)", and not "Auto-format JSON payloads (improves readability)". To see the raw data choose the option "Display raw payloads (displays binary data as hexadecimal values)".

In detail:

1. Subscribe to $aws/things/my_thing/shadow/update/accepted

2. Publish to $aws/things/my_thing/shadow/update

{"state":{"reported":{"name":"🏡"}}}

The raw data for this message shows the emoji encoded in UTF-8 f09f8fa1: 7b227374617465223a7b227265706f72746564223a7b226e616d65223a22f09f8fa1227d7d7d
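For anyone who wants to check the bytes themselves, here is a minimal Python sketch that decodes the hex dump above and prints both encodings of the emoji. The variable names are only illustrative; the hex string is copied from the raw payload shown above.

```python
# Hex dump of the published payload, copied from the raw view above.
raw_hex = (
    "7b227374617465223a7b227265706f72746564223a7b"
    "226e616d65223a22f09f8fa1227d7d7d"
)

payload = bytes.fromhex(raw_hex)
text = payload.decode("utf-8")  # the payload is valid UTF-8
print(text)

house = "\U0001F3E1"  # the house emoji, code point U+1F3E1

# UTF-8 form: the four bytes sent on shadow/update.
emoji_utf8_hex = house.encode("utf-8").hex()
# UTF-16 form: the surrogate pair that shows up as \uD83C\uDFE1.
emoji_utf16_hex = house.encode("utf-16-be").hex()

print(emoji_utf8_hex, emoji_utf16_hex)
```

Both hex strings describe the same code point, so the two payloads differ in how the character is written, not in which character it is.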

3. Payload on $aws/things/my_thing/shadow/update/accepted topic shows the emoji encoded in UTF-16 \u escape format:

{"state":{"reported":{"name":"\uD83C\uDFE1"}},"metadata":{"reported":{"name":{"timestamp":1744211708}}},"version":1970,"timestamp":1744211708}

I expected the payload to be identical on both topics.
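One note on the comparison itself: the accepted document shown above also carries "metadata", "version" and "timestamp", so a byte-for-byte check cannot succeed even without the emoji. A minimal sketch of a check that compares the parsed "state" instead, assuming both payloads are JSON; the function name state_matches is hypothetical and not part of any AWS SDK:

```python
import json

# Payload published to shadow/update (emoji as raw UTF-8 text).
sent = '{"state":{"reported":{"name":"🏡"}}}'

# Payload seen on shadow/update/accepted (emoji as \u escapes), copied from above.
received = (
    r'{"state":{"reported":{"name":"\uD83C\uDFE1"}},'
    r'"metadata":{"reported":{"name":{"timestamp":1744211708}}},'
    r'"version":1970,"timestamp":1744211708}'
)


def state_matches(sent_payload, accepted_payload):
    """Compare the parsed "state" objects instead of the raw bytes.

    json.loads turns the surrogate-pair escape into the same single
    character as the raw UTF-8 emoji, so both spellings compare equal.
    """
    return json.loads(sent_payload)["state"] == json.loads(accepted_payload)["state"]


print(sent == received)               # raw comparison
print(state_matches(sent, received))  # comparison after JSON parsing
```

This treats the two spellings of the emoji as equal because they are the same JSON string value once parsed.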

I tried publishing to the shadow/update topic with the MQTT v5 Payload Format Indicator set to "UTF-8" and to "Binary" and the results are the same. But I notice if I publish to shadow/update with the Payload Format Indicator set to a value then the message on the shadow/update/accepted topic does not have a Payload Format Indicator. I expected the Payload Format Indicator to be passed from publisher to subscriber.

Thanks for any help you can give.

asked a year ago · 133 views
1 Answer
  • Regarding the Payload Format Indicator: you wrote that publishing to the shadow/update topic with the MQTT v5 Payload Format Indicator set to "UTF-8" or to "Binary" gives the same result, and that the message on the shadow/update/accepted topic arrives without a Payload Format Indicator even though you expected it to be passed from publisher to subscriber. This part is a known issue. Shadow has never supported MQTT 5.

  • Also, we do convert to Unicode (UTF-16) in the Shadow when processing the topic messages. As a workaround, please use Unicode (UTF-16) encoding for your messages.
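One possible reading of this workaround is to publish the JSON with non-ASCII characters already written as \u escapes. A minimal Python sketch under that assumption; build_shadow_update is a hypothetical helper, and whether Shadow echoes such a payload back byte-for-byte is not confirmed here:

```python
import json


def build_shadow_update(reported):
    """Serialize a shadow update with all non-ASCII characters escaped.

    ensure_ascii=True makes json.dumps write characters outside the BMP
    as UTF-16 surrogate-pair escapes, so the payload is plain ASCII.
    """
    document = {"state": {"reported": reported}}
    text = json.dumps(document, ensure_ascii=True, separators=(",", ":"))
    return text.encode("ascii")


payload = build_shadow_update({"name": "🏡"})
print(payload.decode("ascii"))
```

Note that Python writes the escape with lowercase hex digits, while the accepted payload in the question uses uppercase (\uD83C\uDFE1), so an exact string comparison may still differ in case. Comparing the parsed JSON values avoids that.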

AWS
SUPPORT ENGINEER
answered a year ago
