At the moment, (for each index type) we write values out to a temporary index file then swap it into place. It's then read in and generally assumed to be correct.
This leaves us with a few issues
index.tmp
- it's not safe for two processes to try and build an index at the same timeThere are, undoubtedly, other issues lurking in there too. This issue is being raised to find and fix those as well as the above (obviously, any showstoppers should get their own issues)
Activity
21-Jan-24 10:33
assigned to @btasker
21-Jan-24 10:34
This one was actually partially addressed yesterday in d2335032533a89448f008ea32ce8b6a0beb0d172
The only issue is, the commit has a bug in it:
It's comparing the index to itself (rather than the temporary file) so will never update the index. Fixing that in a moment.
21-Jan-24 10:36
I think what I'd like to do, is to add a "header" line to the index - a single line containing metadata about the index.
Amongst those could be a checksum (it probably doesn't need to be much more than a CRC32) of the rest of the data, perhaps along with a line count.
That way, after loading, we can check whether we've loaded the data that the index claims to contain.
21-Jan-24 10:45
mentioned in commit 9dc24de2b6ec22279a2d0ea8ab4d90f3a8c7fab4
Message
fix: compare index to temporary file, not to itself (see utilities/file_location_listing#43)
21-Jan-24 11:25
mentioned in commit 4bb054be80e97c32b2cbf5754d5fc2c5cca76cf7
Message
feat: Read hash from index file header (utilities/file_location_listing#43)
21-Jan-24 11:32
mentioned in commit e51765f06f2e7c1ca08d8969ad94f71c4cd6b3a6
Message
feat: add header to tags index (utilities/file_location_listing#43)
21-Jan-24 12:06
mentioned in commit 43323a2bc057d0d236f4c311d81c7c4cc7ae9ac6
Message
feat: Read index headers at load and compare checksums (utilities/file_location_listing#43)
21-Jan-24 12:07
The most recent commit adds stats on checksum mismatches:
We obviously want that to always be 0. We can get a failure rate by dividing by
index.loads
21-Jan-24 12:13
OK, obviously needs testing before we can think about cutting a release, but I think we're probably ready to merge that branch.
21-Jan-24 12:15
mentioned in merge request !4
21-Jan-24 12:16
mentioned in commit 05c9703b15bfe4c5440baa2bca74933b1f3e5d8b
Message
feat: Introduce index headers
feat: prefix header with fixed value When reading, this allows us to test the first character of the first line to see if it's a header (ensuring we don't bork if loading an older index)
feat: Read index headers at load and compare checksums ([utilities/file_location_listing#43](/issue/utilities/file_location_listing/43.html))
hashIndex()
21-Jan-24 12:21
There are two ways to go at this (though not strictly mutually exclusive):
Option 1 is much simpler to implement but means we can still get unexpected behaviour - if an index build runs slow (for whatever reason) and another fast run starts, we might ultimately end up overwriting a new up-to-date index with an out of date one (when the slow run completes).
Option 2 prevents that, because the second run simply wouldn't start. It's the same outcome but with less resource wastage but does leave open the possibility of deadlock (we could perhaps expose a stat to show the presence of locks, allowing alerting to be implemented).
21-Jan-24 14:04
I think, for now, making sure the filename is unique will be more than sufficient.
I don't want to over-engineer this either. Whilst we'd give a strong guarantee of uniqueness by importing something like
uuid
(orshortuuid
) adding that tostorage
would mean the server importing those at runtime for no good reason.Whilst we could inline the import, for our purposes, it's probably actually enough to call
time.time_ns()
and use that as a filename suffix.21-Jan-24 14:06
mentioned in commit 7c847cc75720decf264862c411a03067c0399866
Message
fix: apply a unique(ish) suffix to temp files (utilities/file_location_listing#43)
21-Jan-24 14:14
Talking of locking though, there is potential for a race with the tombstone:
indexStorage()
runsWhen the new index is loaded, it'll include an item that no longer exists in storage. Whilst
storage
should handle this gracefully, it's still less than ideal.There's a bit of a catch-22 here though:
What we probably need to do is
Either way, I think this probably deserves it's own issue - raised as #44