Skip to content

BasicStation AWS IoTCore disconnect after "Repeated excesive clock drift"

0

Hello,

I’m using a RAK2287 with a RPi4 along with BasicStation to create a gateway device to deliver sensor data through to AWS IoTCore with CUPS authentication.

After starting my gateway it seems as if things are going well, but after a few minutes I get a number of ‘time sync rejected’ messages and sometimes some Repeated excessive clock drifts between MCU/SX130X#0 messages before the station disconnects and restarts.

I'm not sure what's causing this: whether it's on the AWS side, or the gateway side; triggered by AWS or triggered by the gateway; if it relates to time sync or if it's just coincidental.

Some logs below..

  
2021-10-08 12:55:48.920 \[SYS:INFO] Logging     : stderr (maxsize=10000000, rotate=3)  
2021-10-08 12:55:48.920 \[SYS:INFO] Station Ver : 2.0.5(corecell/std) 2021-10-05 15:02:16  
2021-10-08 12:55:48.921 \[SYS:INFO] Package Ver : (null)  
2021-10-08 12:55:48.921 \[SYS:INFO] proto EUI   : 0:dca6:3205:f1a        (/sys/class/net/eth0/address)  
2021-10-08 12:55:48.921 \[SYS:INFO] prefix EUI  : ::1    (builtin)  
2021-10-08 12:55:48.921 \[SYS:INFO] Station EUI : dca6:32ff:fe05:f1a  
2021-10-08 12:55:48.921 \[SYS:INFO] Station home: ./     (builtin)  
2021-10-08 12:55:48.921 \[SYS:INFO] Station temp: /var/tmp/      (builtin)  
2021-10-08 12:55:49.123 \[TCE:INFO] Starting TC engine  
2021-10-08 12:55:49.124 \[any:INFO] ./tc.trust:  
cert. version     : 3  
serial number     : A7:0E:4A:4C:34:82:B7:7F  
issuer name       : C=US, O=Starfield Technologies, Inc., OU=Starfield Class 2 Certification Authority  
subject name      : C=US, ST=Arizona, L=Scottsdale, O=Starfield Technologies, Inc., CN=Starfield Services Root Certificate Authority - G2  
issued  on        : 2009-09-02 00:00:00  
expires on        : 2034-06-28 17:39:16  
signed using      : RSA with SHA-256  
RSA key size      : 2048 bits  
basic con2021-10-08 12:55:49.132 \[any:INFO] ./tc.crt:  
cert. version     : 3  
serial number     : 70:B9:E0:68:0E:62:85:AC:A4:7F:0D:0C:81:3D:6D:06:D7:95:C0:AE  
issuer name       : OU=Amazon Web Services O=Amazon.com Inc. L=Seattle ST=Washington C=US  
subject name      : CN=AWS IoT Certificate  
issued  on        : 2021-10-05 13:52:18  
expires on        : 2049-12-31 23:59:59  
signed using      : RSA with SHA-256  
RSA key size      : 2048 bits  
basic constraints : CA=false  
key usage         : Digital Signature  
2021-10-08 12:55:49.132 \[AIO:INFO]  
2021-10-08 12:55:51.807 \[TCE:INFO] Connecting to INFOS: wss://A3OQ74V8656RDU.lns.lorawan.us-west-2.amazonaws.com:443  
2021-10-08 12:55:51.807 \[CUP:INFO] Starting a CUPS session in 0 seconds.  
2021-10-08 12:55:51.807 \[CUP:INFO] Starting a CUPS session now.  
2021-10-08 12:55:51.807 \[CUP:INFO] Connecting to CUPS ... https://A3OQ74V8656RDU.cups.lorawan.us-west-2.amazonaws.com:443 (try #1)  
2021-10-08 12:55:51.808 \[any:INFO] ./cups.trust:  
cert. version     : 3  
serial number     : A7:0E:4A:4C:34:82:B7:7F  
issuer name       : C=US, O=Starfield Technologies, Inc., OU=Starfield Class 2 Certification Authority  
subject name      : C=US, ST=Arizona, L=Scottsdale, O=Starfield Technologies, Inc., CN=Starfield Services Root Certificate Authority - G2  
issued  on        : 2009-09-02 00:00:00  
expires on        : 2034-06-28 17:39:16  
signed using      : RSA with SHA-256  
RSA key size      : 2048 bits  
basic c2021-10-08 12:55:51.816 \[any:INFO] ./cups.crt:  
cert. version     : 3  
serial number     : 2F:B2:92:89:7F:3C:5C:D7:89:CB:36:E6:5A:1B:82:93:A7:9D:AF:B6  
issuer name       : OU=Amazon Web Services O=Amazon.com Inc. L=Seattle ST=Washington C=US  
subject name      : CN=AWS IoT Certificate  
issued  on        : 2021-09-27 10:25:44  
expires on        : 2049-12-31 23:59:59  
signed using      : RSA with SHA-256  
RSA key size      : 2048 bits  
basic constraints : CA=false  
key usage         : Digital Signature  
2021-10-08 12:55:51.816 \[AIO:INFO]  
2021-10-08 12:55:51.928 \[CUP:VERB] Retrieving update-info from CUPS https://A3OQ74V8656RDU.cups.lorawan.us-west-2.amazonaws.com:443...  
2021-10-08 12:55:53.746 \[AIO:DEBU] \[4] HTTP connection shutdown...  
2021-10-08 12:55:53.746 \[CUP:INFO] Interaction with CUPS done (no updates) - next regular check in 1d  
2021-10-08 12:55:53.832 \[TCE:INFO] Infos: dca6:32ff:fe05:0f1a dca6:32ff:fe05:0f1a wss://A3OQ74V8656RDU.lns.lorawan.us-west-2.amazonaws.com:443/gateway/dca632fffe050f1a  
2021-10-08 12:55:53.832 \[AIO:DEBU] \[3] ws_close reason=1000  
2021-10-08 12:55:53.832 \[AIO:DEBU] \[3|WS] Server sent close: reason=1006  
2021-10-08 12:55:53.832 \[AIO:DEBU] \[3] WS connection shutdown...  
2021-10-08 12:55:53.832 \[any:INFO] ./tc.trust:  
cert. version     : 3  
serial number     : A7:0E:4A:4C:34:82:B7:7F  
issuer name       : C=US, O=Starfield Technologies, Inc., OU=Starfield Class 2 Certification Authority  
subject name      : C=US, ST=Arizona, L=Scottsdale, O=Starfield Technologies, Inc., CN=Starfield Services Root Certificate Authority - G2  
issued  on        : 2009-09-02 00:00:00  
expires on        : 2034-06-28 17:39:16  
signed using      : RSA with SHA-256  
RSA key size      : 2048 bits  
basic con2021-10-08 12:55:53.835 \[any:INFO] ./tc.crt:  
cert. version     : 3  
serial number     : 70:B9:E0:68:0E:62:85:AC:A4:7F:0D:0C:81:3D:6D:06:D7:95:C0:AE  
issuer name       : OU=Amazon Web Services O=Amazon.com Inc. L=Seattle ST=Washington C=US  
subject name      : CN=AWS IoT Certificate  
issued  on        : 2021-10-05 13:52:18  
expires on        : 2049-12-31 23:59:59  
signed using      : RSA with SHA-256  
RSA key size      : 2048 bits  
basic constraints : CA=false  
key usage         : Digital Signature  
2021-10-08 12:55:53.835 \[AIO:INFO]  
2021-10-08 12:55:53.878 \[TCE:VERB] Connecting to MUXS...  
2021-10-08 12:55:54.964 \[TCE:VERB] Connected to MUXS.  
2021-10-08 12:55:55.169 \[S2E:WARN] Unknown field in router_config - ignored: protocol (0xFD309030)  
2021-10-08 12:55:55.169 \[S2E:WARN] Unknown field in router_config - ignored: regionid (0xE6FFB211)  
2021-10-08 12:55:55.171 \[SYS:VERB] rinit.sh: Forked, waiting...  
CoreCell reset through GPIO17...  
CoreCell power enable through GPIO18...  
2021-10-08 12:55:56.140 \[SYS:INFO] Process rinit.sh (pid=886) completed  
ERROR: Failed to stop TX trigger  
ERROR: Failed to stop TX trigger  
ERROR: failed to close I2C device (err=-1)  
2021-10-08 12:55:56.140 \[RAL:INFO] Lora gateway library version: Version: 1.0.5;  
2021-10-08 12:55:56.146 \[RAL:VERB] Connecting to device: /dev/spidev0.0  
2021-10-08 12:55:56.146 \[RAL:DEBU] SX130x txlut table (16 entries)  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 txlut  0:  rf_power=12 pa_gain=0 pwr_idx=15  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 txlut  1:  rf_power=13 pa_gain=0 pwr_idx=16  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 txlut  2:  rf_power=14 pa_gain=0 pwr_idx=17  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 txlut  3:  rf_power=15 pa_gain=0 pwr_idx=19  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 txlut  4:  rf_power=16 pa_gain=0 pwr_idx=20  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 txlut  5:  rf_power=17 pa_gain=0 pwr_idx=22  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 txlut  6:  rf_power=18 pa_gain=1 pwr_idx=1  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 txlut  7:  rf_power=19 pa_gain=1 pwr_idx=2  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 txlut  8:  rf_power=20 pa_gain=1 pwr_idx=3  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 txlut  9:  rf_power=21 pa_gain=1 pwr_idx=4  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 txlut 10:  rf_power=22 pa_gain=1 pwr_idx=5  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 txlut 11:  rf_power=23 pa_gain=1 pwr_idx=6  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 txlut 12:  rf_power=24 pa_gain=1 pwr_idx=7  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 txlut 13:  rf_power=25 pa_gain=1 pwr_idx=9  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 txlut 14:  rf_power=26 pa_gain=1 pwr_idx=11  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 txlut 15:  rf_power=27 pa_gain=1 pwr_idx=14  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 rssi_tcomp: coeff_a=0.000 coeff_b=0.000 coeff_c=20.410 coeff_d=2162.560 coeff_e=0.000  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 rxrfchain 0: enable=1 freq=867.5MHz rssi_offset=-215.399994 type=5 tx_enable=1  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 rxrfchain 1: enable=1 freq=868.5MHz rssi_offset=-215.399994 type=5 tx_enable=0  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 ifchain  0: enable=1 rf_chain=0 freq=-400000 bw=0 SF=0 sync_word=0/0  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 ifchain  1: enable=1 rf_chain=0 freq=-200000 bw=0 SF=0 sync_word=0/0  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 ifchain  2: enable=1 rf_chain=0 freq=0 bw=0 SF=0 sync_word=0/0  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 ifchain  3: enable=1 rf_chain=0 freq=200000 bw=0 SF=0 sync_word=0/0  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 ifchain  4: enable=1 rf_chain=0 freq=400000 bw=0 SF=0 sync_word=0/0  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 ifchain  5: enable=1 rf_chain=1 freq=-400000 bw=0 SF=0 sync_word=0/0  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 ifchain  6: enable=1 rf_chain=1 freq=-200000 bw=0 SF=0 sync_word=0/0  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 ifchain  7: enable=1 rf_chain=1 freq=0 bw=0 SF=0 sync_word=0/0  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 ifchain  8: enable=1 rf_chain=1 freq=-200000 bw=5 SF=7 sync_word=0/0 \[STD] Explicit header  
2021-10-08 12:55:56.146 \[RAL:VERB] SX1302 ifchain  9: enable=1 rf_chain=0 freq=0 bw=0 SF=0 sync_word=0/0  
2021-10-08 12:55:56.146 \[RAL:VERB] Station device: /dev/spidev0.0 (PPS capture disabled)  
2021-10-08 12:55:58.575 \[RAL:VERB] Concentrator started (2s428ms)  
2021-10-08 12:55:58.575 \[S2E:INFO] Configuring for region: EU868 -- 863.0MHz..870.0MHz  
2021-10-08 12:55:58.575 \[S2E:VERB]   DR0  SF12/BW125  
2021-10-08 12:55:58.575 \[S2E:VERB]   DR1  SF11/BW125  
2021-10-08 12:55:58.575 \[S2E:VERB]   DR2  SF10/BW125  
2021-10-08 12:55:58.575 \[S2E:VERB]   DR3  SF9/BW125  
2021-10-08 12:55:58.575 \[S2E:VERB]   DR4  SF8/BW125  
2021-10-08 12:55:58.575 \[S2E:VERB]   DR5  SF7/BW125  
2021-10-08 12:55:58.575 \[S2E:VERB]   DR6  SF7/BW250  
2021-10-08 12:55:58.575 \[S2E:VERB]   DR7  FSK  
2021-10-08 12:55:58.575 \[S2E:VERB]   DR8  undefined  
2021-10-08 12:55:58.575 \[S2E:VERB]   DR9  undefined  
2021-10-08 12:55:58.575 \[S2E:VERB]   DR10 undefined  
2021-10-08 12:55:58.575 \[S2E:VERB]   DR11 undefined  
2021-10-08 12:55:58.575 \[S2E:VERB]   DR12 undefined  
2021-10-08 12:55:58.575 \[S2E:VERB]   DR13 undefined  
2021-10-08 12:55:58.575 \[S2E:VERB]   DR14 undefined  
2021-10-08 12:55:58.575 \[S2E:VERB]   DR15 undefined  
2021-10-08 12:55:58.575 \[S2E:VERB]   TX power: 0.0 dBm EIRP  
2021-10-08 12:55:58.575 \[S2E:VERB]   JoinEUI list: 0 entries  
2021-10-08 12:55:58.575 \[S2E:VERB]   NetID filter: FFFFFFFF-FFFFFFFF-FFFFFFFF-FFFFFFFF  
2021-10-08 12:55:58.575 \[S2E:VERB]   Dev/test settings: nocca=0 nodc=0 nodwell=0  
2021-10-08 12:56:25.206 \[S2E:VERB] RX 867.5MHz DR0 SF12/BW125 snr=-19.2 rssi=-114 xtime=0xB0000001A2303B - updf mhdr=40 DevAddr=0100DA30 FCtrl=D8 FCnt=21758 FOpts=\[BD0069F0009E437B] 8E8D31E6..0626 mic=1125777498 (52 bytes)  
2021-10-08 12:56:40.586 \[SYN:INFO] MCU/SX130X drift stats: min: _27.1ppm  q50: _30.9ppm  q80: _33.3ppm  max: _34.3ppm - threshold q90: +34.3ppm  
2021-10-08 12:56:40.586 \[SYN:INFO] Mean MCU drift vs SX130X#0: 30.8ppm  
2021-10-08 12:56:59.489 \[SYN:INFO] Time sync qualities: min=185 q90=213 max=213 (previous q90=2147483647)  
2021-10-08 12:57:22.599 \[SYN:VERB] Time sync rejected: quality=6398 threshold=213  
2021-10-08 12:57:24.699 \[SYN:INFO] MCU/SX130X drift stats: min: _22.4ppm  q50: _26.2ppm  q80: _29.0ppm  max: _30.5ppm - threshold q90: +30.4ppm  
2021-10-08 12:57:24.699 \[SYN:INFO] Mean MCU drift vs SX130X#0: 26.3ppm  
2021-10-08 12:57:45.705 \[SYN:VERB] Time sync rejected: quality=3486 threshold=213  
2021-10-08 12:58:02.508 \[SYN:INFO] Time sync qualities: min=188 q90=213 max=6398 (previous q90=213)  
2021-10-08 12:58:08.811 \[SYN:INFO] MCU/SX130X drift stats: min: _16.2ppm  q50: _22.4ppm  q80: _23.8ppm  max: _26.2ppm - threshold q90: +26.2ppm  
2021-10-08 12:58:08.811 \[SYN:INFO] Mean MCU drift vs SX130X#0: 22.1ppm  
2021-10-08 12:58:40.320 \[SYN:VERB] Time sync rejected: quality=6319 threshold=213  
2021-10-08 12:58:52.922 \[SYN:INFO] MCU/SX130X drift stats: min: _12.4ppm  q50: _19.0ppm  q80: _20.5ppm  max: _22.4ppm - threshold q90: +21.4ppm  
2021-10-08 12:58:52.922 \[SYN:INFO] Mean MCU drift vs SX130X#0: 18.7ppm  
2021-10-08 12:59:05.523 \[SYN:INFO] Time sync qualities: min=185 q90=212 max=6319 (previous q90=213)  
2021-10-08 12:59:20.229 \[SYN:VERB] Time sync rejected: quality=3559 threshold=212  
2021-10-08 12:59:25.586 \[AIO:DEBU] \[3] Connection closed unexpectedly  
2021-10-08 12:59:25.586 \[AIO:DEBU] \[3] WS connection shutdown...  
2021-10-08 12:59:25.587 \[TCE:VERB] Connection to MUXS closed in state 4  
2021-10-08 12:59:25.587 \[TCE:INFO] MUXS reconnect backoff 1s (retry 0)  
asked 4 years ago609 views
7 Answers
0

Turns out I just had to add a "pps": true line to the station.conf file.

answered 4 years ago
0

Glad that the issue has been solved. The time sync issue does exist. But it should not disconnect the gateway. We will be working to fix this time sync issue very soon.

AWS
answered 4 years ago
0

Hi Russ,

I've had to revisit this problem since we're seeing a repeated loss of connection to AWS when using a USB cellular modem which emulated as an eth interface.

Every ~3 minutes, we get this error:

2021-10-25 12:29:08.820 \\[AIO:DEBU] \[3] Connection closed unexpectedly  
2021-10-25 12:29:08.820 \\[AIO:DEBU] \[3] WS connection shutdown...  
2021-10-25 12:29:08.821 \\[TCE:VERB] Connection to MUXS closed in state 4  

I'm not sure if this is a problem on the AWS side, maybe something we have to configure, or if this is a problem with using a cellular modem. Do you have any insight here?

Thanks,
Craig

Edited by: CraigE on Oct 25, 2021 6:09 AM

answered 4 years ago
0

Hi Craig,

Is that all in the gateway's log? How do we know why the gateway is disconnecting so frequently?
Does this happen only recently or always?

If it relates to the AWS side, any error message from the gateway would be helpful, such as what caused the disconnect, etc.

But it looks like all the handshakes with AWS have passed. If the gateway initiates all the disconnections, checking with the gateway vendor should be the next step to go.

Thanks,
Russ

AWS
answered 4 years ago
0

Hi Russ,

This happens reliably when we are on cellular connection, with multiple cellular modules. The gateway appears to only disconnect if there has not been any LoRaWAN packets come through for about 3 minutes. If we have regular LoRaWAN packets on the gateway this seems to fend off the disconnect.

This is the only log information that shows at the time of disconnect that relates to the web socket. We get time sync messages, but we also get these on wifi and do not disconnect:

2021-10-25 12:31:33.493 \[SYN:INFO] Mean MCU drift vs SX130X#0: 7.8ppm
2021-10-25 12:31:39.796 \[SYN:INFO] Time sync qualities: min=237 q90=261 max=261 (previous q90=261)
2021-10-25 12:31:49.253 \[SYN:VERB] Time sync rejected: quality=262 threshold=261
2021-10-25 12:31:51.354 \[SYN:ERRO] No PPS pulse for ~84 secs
2021-10-25 12:31:53.456 \[SYN:VERB] Time sync rejected: quality=284 threshold=261
2021-10-25 12:32:10.266 \[SYN:VERB] Time sync rejected: quality=284 threshold=261
2021-10-25 12:32:20.772 \[SYN:INFO] MCU/SX130X drift stats: min: _1.0ppm q50: _6.7ppm q80: _10.5ppm max: _12.8ppm - threshold q90: +11.9ppm
2021-10-25 12:32:20.772 \[SYN:INFO] Mean MCU drift vs SX130X#0: 7.2ppm
2021-10-25 12:32:26.069 \[AIO:DEBU] [3] Connection closed unexpectedly
2021-10-25 12:32:26.069 \[AIO:DEBU] [3] WS connection shutdown...
2021-10-25 12:32:26.069 \[TCE:VERB] Connection to MUXS closed in state 4
2021-10-25 12:32:26.069 \[TCE:INFO] MUXS reconnect backoff 16s (retry 4)
2021-10-25 12:32:41.784 \[SYN:INFO] Time sync qualities: min=236 q90=262 max=284 (previous q90=261)
2021-10-25 12:32:42.070 \[any:INFO] ./tc.trust:

We weren't sure if there was something on the AWS side that may be kicking us off the web socket. I'm aware that basicstation has a keepalive TCP packet but despite me changing the regularity of these it doesn't seem to have an effect.

Edited by: CraigE on Oct 26, 2021 3:08 AM

answered 4 years ago
0

Hi Craig,

Thanks for the input. I will try to look into any possible cause for the disconnections. If you have a way to check with the gateway vendor, they probably can provide further investigation that lead to the potential final fix. I will get back to you after some investigation.

thanks,
Russ

AWS
answered 4 years ago
0

Hi Craig,

I've looked into the backend logs. I do not see any cloud-initiated disconnect logs. It will be very likely the disconnection is triggered by the gateway's firmware. One noticeable thing I see in the logs is that there were very frequent (every 20~30 seconds) TLS handshake requests coming from the gateways. AWS side showed no error.
Can you check more on that perspective?

Thanks,

AWS
answered 4 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.