utilities/gadgetbridge_to_influxdb#8: Syncing Issues



Issue Information

Issue Type: issue
Status: closed
Reported By: btasker
Assigned To: btasker

Milestone: v0.4
Created: 26-Aug-23 15:53



Description

I've had a weird sync issues a couple of times now. Raising this for tracking here as I don't want to spam the Gadgetbridge tracker whilst I try different things.

When the sync issue happens, Gadgetbridge continues to sync PAI, Stress etc. But, it no longer fetches heart-rate or steps (or, I assume sleep).

I originally noted it in an upstream ticket.

The Gadgetbridge log shows

Start date unsuccessful response: 0x10 0x01 0x02

Which the GB dev has never seen happen before.

The first time it happened, I unpaired the device (via Android's bluetooth settings) then re-paired and rediscovered in Gadgetbridge.

Yesterday morning, in order to test the changes in the new nightly, I removed the watch from Gadgetbridge, updated the nightly, and then re-discovered and re-paired. It all appeared to work - the Step counter dial appeared in Gadgetbridge's UI.

This afternoon though, I noticed that my data was out of date - the latest heart-rate data I was seeing was from this morning.

When I checked GB's log

14:01:38.925 [Binder:22963_1] WARN n.f.g.s.d.h.o.AbstractFetchOperation - Start date unsuccessful response: 0x10 0x01 0x02

Although the step counter dial was visible in GB, the value it was displaying was hours out of date and didn't line up with what was on the watch.

This time, unpairing and re-pairing didn't resolve the issue. I had to power-off the watch, unpair and clear GB's database to get sync back and working (of course, it might also be that repeated attempts would have got us there in the end).

Raising this ticket to track digging in further



Toggle State Changes

Activity


assigned to @btasker

I also had a look at behaviour over the last couple of days

image

The jumps to 200 occur where there's a gap in the data. It doesn't look like this is something that's isolated to this morning - there was a blip yesterday evening and a long gap overnight.

It's hard to say for sure, though, where a data-gap started and ended. GB queries back in time, so some of the data we're seeing as present might well have been collected later, hiding a gap.

In order to better track this, utilities/gadgetbridge_to_influxdb#7 introduces two new fields

  • last_seen: the NS timestamp of the most recent point in the Gadgetbridge DB
  • last_seen_age: The delta (also in NS) between last_seen and time of collection

There is a fairly distinct possibility that this isn't an issue with Gadgetbridge at all.

In utilities/zepp_to_influxdb#8 I noted that syncing with the official app was unreliable, with the watch's data not being synced for long periods of time without manual intervention.

That intervention isn't quite as extreme has having to unpair and re-pair, but it might be that Zepp does something under the hood to mitigate whatever this is.

So, plan going forwards:

  • I'm going to cut a release so that I can get utilities/gadgetbridge_to_influxdb#7 into use
  • I want to monitor over the next day or so to see where gaps seem to occur, and how long (if at all) they seem to take to recover. I'll try to be hyper-aware of where my phone is during that time (i.e. did I walk out of range of my phone etc etc)
  • Once we've got an idea of frequency, I want to configure Tasker to periodically force a sync (by sending a nodomain.freeyourgadget.gadgetbridge.command.ACTIVITY_SYNC intent) in case this is something half-timing out
verified

mentioned in commit ae14b4b72cf05fcb67a80f404a105f251b71cd48

Commit: ae14b4b72cf05fcb67a80f404a105f251b71cd48 
Author: B Tasker                            
                            
Date: 2023-08-26T17:03:52.000+01:00 

Message

Quick tidy up. Cutting a temporary release for utilities/gadgetbridge_to_influxdb#8

This release will not be tagged and won't be pushed to docker hub

+2 -4 (6 lines changed)

Image built as registry.bentasker.co.uk/utilities/gadgetbridge_to_influxdb:iss-8

mentioned in commit sysconfigs/bumblebee-kubernetes-charts@8939574d52c648c8a2906b3d5d9cf39e3fc54542

Commit: sysconfigs/bumblebee-kubernetes-charts@8939574d52c648c8a2906b3d5d9cf39e3fc54542 
Author: ben                            
                            
Date: 2023-08-26T17:05:49.000+01:00 

Message

Update chart to use temporary release (utilities/gadgetbridge_to_influxdb#8)

+1 -1 (2 lines changed)

This may have been in vain, it looks like Jose has narrowed in on the cause.

It seems to be tied to the Truncate fetch operation timestamps option - I've got a fairly reliable repro of the sync issues now:

  • Hit disconnect in GB
  • Toggled Truncate back off
  • Checked steps in GB (4386)
  • Waved my arm around wildly
  • Checked steps on watchface (4406)
  • Hit connect in GB
  • Hit Sync
  • Steps stay at 4386
  • Hit Sync a few times, no change in steps
  • Turned Truncate back on
  • Hit sync
  • Steps update to 4406

This is totally reproducible, I've now done it several times - it'll fail to sync activity data until I turn Truncate on.

Is GB sending the last seen timestamp to the watch during sync? I'm wondering if the issue is that I've (presumably) changed that setting after the first data sync occurs, so it ends up sending an incorrect TS over?

Explanation from Jose:

What that truncate option does is remove the seconds, so it only sends yyyy-mm-dd HH:mm. The problem with this is, if there's a very short workout (< 1 minute), the time will never advance to the next minute, so it becomes impossible to fetch further workouts. I have added that truncate option to all Huami devices, but only tested it on a few and it's very possible that it does not work correctly on all of them.

I am going to remove it from the Bip 3 Pro, since it's now clear that it does not work correctly.

I'll leave the temporary release running just in case there's another issue - hopefully shouldn't be too far from a proper release anyway.

I'm about to test the new Gadgetbridge nightly (because it enables some experimental features). Assuming there are no syncing issues after upgrading to that, I'll close this off as the issue's been identified and resolved upstream.

Looks good to me - had a brief moment where I thought syncing wasn't working, but it's because my attempts at faking exercise were too short: Gadgetbride truncates the timestamp to a 1 minute granularity, so some steps need to be registered in the subsequent minute before they'll show up (it does all catch up).