
AWS IoT MQTT Test Client: Unicode emoji in message payload gets converted from UTF-8 bytes to \u encoding


I performed this test on the console's AWS IoT MQTT Test Client:

  1. I subscribed to the shadow/update/accepted topic for my thing
  2. I published a payload containing a Unicode house emoji 🏡 (U+1F3E1), UTF-8 encoded as F0 9F 8F A1, to the shadow/update topic for my thing
  3. But the payload on the shadow/update/accepted topic contained the emoji as a UTF-16 surrogate pair in \u escape format: "\uD83C\uDFE1"

My question is: why is AWS converting from UTF-8 to UTF-16?

My problem is that I have code that checks that the payload received on the shadow/update/accepted topic is the same as the payload sent on the shadow/update topic, and it reports an error if they differ.
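A check like this can be made independent of how the emoji is escaped by comparing parsed JSON instead of raw bytes. The following is only a sketch, assuming the check needs the reported values to match rather than the exact byte sequence. The helper name `reported_state` is hypothetical, and the two payloads are the ones shown in this question.

```python
import json

def reported_state(payload: bytes) -> dict:
    """Parse a shadow payload and return only its state.reported object."""
    doc = json.loads(payload.decode("utf-8"))
    return doc["state"]["reported"]

# Payload as published: the emoji is the raw UTF-8 bytes F0 9F 8F A1.
sent = b'{"state":{"reported":{"name":"\xf0\x9f\x8f\xa1"}}}'

# Payload as received on update/accepted: the emoji is a \u surrogate-pair
# escape, and the service has added metadata, version and timestamp fields.
received = (
    b'{"state":{"reported":{"name":"\\uD83C\\uDFE1"}},'
    b'"metadata":{"reported":{"name":{"timestamp":1744211708}}},'
    b'"version":1970,"timestamp":1744211708}'
)

same_bytes = sent == received                                  # byte-for-byte comparison
same_state = reported_state(sent) == reported_state(received)  # comparison after parsing
print(same_bytes, same_state)  # False True
```

A JSON parser decodes the `\uD83C\uDFE1` escape to the same character as the raw UTF-8 bytes, so the parsed values are equal even though the byte sequences differ.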

NOTE: when testing this on the MQTT Test Client make sure when subscribing and publishing that the Additional Configuration / MQTT payload display option is set to "Display payloads as strings (more accurate)", and not "Auto-format JSON payloads (improves readability)". To see the raw data choose the option "Display raw payloads (displays binary data as hexadecimal values)".

In detail:

1 Subscribe to $aws/things/my_thing/shadow/update/accepted

2 Publish to $aws/things/my_thing/shadow/update

{"state":{"reported":{"name":"🏡"}}}

The raw data for this message shows the emoji encoded in UTF-8 as f09f8fa1: 7b227374617465223a7b227265706f72746564223a7b226e616d65223a22f09f8fa1227d7d7d
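The hex dump above can be checked offline. This small sketch decodes it with the Python standard library and confirms that the four bytes f09f8fa1 are the UTF-8 encoding of U+1F3E1. The variable names are illustrative only.

```python
# Hex dump copied from the MQTT Test Client's raw payload display.
raw_hex = "7b227374617465223a7b227265706f72746564223a7b226e616d65223a22f09f8fa1227d7d7d"

raw = bytes.fromhex(raw_hex)   # hex string -> bytes
text = raw.decode("utf-8")     # bytes -> str, interpreting them as UTF-8

# UTF-8 encoding of the house emoji, code point U+1F3E1.
emoji_bytes = "\U0001F3E1".encode("utf-8")

print(text)
print(emoji_bytes.hex())       # f09f8fa1
```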

3 Payload on $aws/things/my_thing/shadow/update/accepted topic shows the emoji encoded in UTF-16 \u format:

{"state":{"reported":{"name":"\uD83C\uDFE1"}},"metadata":{"reported":{"name":{"timestamp":1744211708}}},"version":1970,"timestamp":1744211708}
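The `\uD83C\uDFE1` sequence in the accepted payload is the UTF-16 surrogate pair for U+1F3E1, written as JSON `\u` escapes. The sketch below shows the standard surrogate-pair arithmetic. The function name `utf16_escape` is hypothetical and is not part of any AWS API.

```python
def utf16_escape(ch: str) -> str:
    """Return the JSON-style \\u escape(s) for a single character."""
    cp = ord(ch)
    if cp < 0x10000:
        # Basic Multilingual Plane: one 16-bit code unit.
        return f"\\u{cp:04X}"
    # Supplementary plane: split the 20-bit offset into two 10-bit halves.
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # low (trail) surrogate
    return f"\\u{high:04X}\\u{low:04X}"

escaped = utf16_escape("\U0001F3E1")
print(escaped)  # \uD83C\uDFE1
```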

I expected the payload to be identical on both topics.

I tried publishing to the shadow/update topic with the MQTT v5 Payload Format Indicator set to "UTF-8" and to "Binary", and the results were the same. However, I noticed that when I publish to shadow/update with the Payload Format Indicator set, the message on the shadow/update/accepted topic does not carry a Payload Format Indicator. I expected the Payload Format Indicator to be passed from publisher to subscriber.

Thanks for any help you can give.

Asked a year ago · 133 views
1 Answer
  • Regarding the Payload Format Indicator not being passed from publisher to subscriber on the shadow/update/accepted topic: this part is a known issue. Shadow has never supported MQTT 5.

  • Also, the Shadow service does convert to Unicode (UTF-16) \u escapes when processing the topic messages. As a workaround, please use Unicode (UTF-16) encoding for your messages.
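One possible reading of this workaround is to publish the payload with non-ASCII characters already written as `\u` escapes. The sketch below uses Python's standard `json` module for that. Note that Python emits lowercase hex digits (`\ud83c\udfe1`), while the accepted payload in the question shows uppercase, and the accepted message also carries extra metadata, version and timestamp fields. Whether the service echoes an escaped payload unchanged is an assumption not confirmed in this thread.

```python
import json

doc = {"state": {"reported": {"name": "\U0001F3E1"}}}

# ensure_ascii=True (the default) writes non-ASCII characters as \uXXXX
# escapes, using a surrogate pair for code points above U+FFFF.
payload = json.dumps(doc, ensure_ascii=True, separators=(",", ":")).encode("utf-8")

print(payload)  # b'{"state":{"reported":{"name":"\\ud83c\\udfe1"}}}'
```

The escaped payload contains only ASCII bytes and parses back to the same document as the raw UTF-8 form.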

AWS
Support Engineer
Answered a year ago
