project Utilities / File Location Listing avatar

utilities/file_location_listing#44: Potential Tombstone Race



Issue Information

Issue Type: issue
Status: closed
Reported By: btasker
Assigned To: btasker

Milestone: v0.2.6
Created: 21-Jan-24 14:14



Description

There is potential for a race with the tombstone:

  1. indexStorage() runs
  2. new index is compiled
  3. A delete is triggered for an entry that's already been examined, removing file and updating tombstone
  4. new index is written out
  5. tombstone is deleted

When the new index is loaded, it'll include an item that no longer exists in storage. Whilst storage should handle this gracefully, it's still less than ideal.

There's a bit of a catch-22 here though:

  • If we remove the tombstone first, searches run whilst reindexing will lose the benefit of the tombstone (at least, if the index reloads in the meantime)
  • If we leave it until afterwards, we run into the issue above

What we probably need to do is

  1. take a copy of the tombstone
  2. do the indexing operation
  3. Check whether the main tombstone has changed on disk
  4. if not, remove it and the copy
  5. if so, identify changes and write only those out (or, maybe take the easy route and don't remove the tombstone - it'll likely get rationalised away next reindex)

The other option is to employ a lockfile, preventing updates of the tombstone (we'd then either need to refuse deletes or perhaps redirect into a temporary tombstone for the locking process to read in when releasing the lock).



Toggle State Changes

Activity


assigned to @btasker

mentioned in issue #43

The other option is to employ a lockfile, preventing updates of the tombstone (we'd then either need to refuse deletes or perhaps redirect into a temporary tombstone for the locking process to read in when releasing the lock).

I think I quite like this idea - in particular, redirecting writes to a temporary tombstone.

The problem with refusing deletes is that it'd be reliant on the deleter coming back and trying again later, whereas redirecting into a temporary file means they'll be applied as soon as possible (we could, of course, still push a warning of some form).

The question, of course, is whether temporary tombstones should be consumed at index load time. Doing so would mean that deletes still come into effect, at the cost of (some) extra complexity in loading.

The question is whether that extra complexity is worth the effort - reindexing is generally quite a quick operation (and deletes are rare) so it's not like it should take too long for them to come into effect anyway.

verified

mentioned in commit ab5d306367f83060f49d04022e6bd0aa56ecf670

Commit: ab5d306367f83060f49d04022e6bd0aa56ecf670 
Author: B Tasker                            
                            
Date: 2024-01-21T14:36:40.000+00:00 

Message

chore: move tombstone management to dedicated file (prep for utilities/file_location_listing#44)

+82 -19 (101 lines changed)

OK, I've moved tombstone management to a dedicated module so that it's all in one place.

We need to implement locking, but there are a few considerations:

  • Deletion occurs in a different processes, so a lock needs to be represented on the filesystem
  • We can't use flock: my pods are backed by NFS which doesn't support it

The simplest way is almost certainly a lockfile

verified

mentioned in commit 9d536d2d1411a07257412c7fcc292a94f2f89e6e

Commit: 9d536d2d1411a07257412c7fcc292a94f2f89e6e 
Author: B Tasker                            
                            
Date: 2024-01-21T15:22:12.000+00:00 

Message

feat: direct writes into temporary file whilst locked (utilities/file_location_listing#44)

When unlocking, temporary writes are then replayed

+82 -9 (91 lines changed)
verified

mentioned in commit c64f4629a6abec74514bb8cff83f3ca8a6c890b6

Commit: c64f4629a6abec74514bb8cff83f3ca8a6c890b6 
Author: B Tasker                            
                            
Date: 2024-01-21T15:30:38.000+00:00 

Message

fix: lock the tombstone during reindexing (utilities/file_location_listing#44)

+5 -1 (6 lines changed)

The issue described in the description is now fixed.

Technically, we've moved it a little - there's no an slight race within unlocking (where a writer might write to the temporary tombstone just after we've read it, but just before we remove it).

Unfortunately, this is probably inavoidable unless we're happy to have the a lock block writes (which we don't want).

It'd also require a double failure (we'd need to receive writes not just whilst the tombstone was locked, but also just as we were unlocking it).

So, I think the answer is probably not to worry too much about that unless it actually happens.