utilities/file_location_listing#51: Search by new tag not working until portal restarted



Issue Information

Issue Type: issue
Status: closed
Reported By: btasker
Assigned To: btasker

Milestone: vnext
Created: 17-Apr-24 11:43



Description

I've just tagged up a bunch of files to make them easily discoverable with a single tag search.

After a recrawl, if I search for them by filename they show up (and the UI shows they have that tag). However, a search for that tag comes up empty (literally 0 results).

The tag is definitely in the tag index (and I can see the relevant files listed there).

  • If I take the mode off exact it still doesn't show up.
  • If I search for a substring within the tag (with matchtype: tag) it doesn't show up either.

To see if there's some kind of issue with index reloading, I've rolled the portal pod. Ahh, the results now come through.

So the answer is probably one of the following:

  • There's an issue with index reloads not reloading the tag index
  • I've been impatient and the portal hadn't actually reloaded yet



Activity


assigned to @btasker

> I've been impatient and the portal hadn't actually reloaded yet

It doesn't look like this is the case - my dashboard implies that the indexes had correctly reloaded

Screenshot_20240417_124517

(it's 0 for two poll intervals there - one for the original reload, and one for just now when I rolled the pod).

So that does imply that the in-memory copy of the tag index may not have been updated correctly.

Should be easy enough to build a repro if that's the case

Repro:

Created a simple file:

# Test file

----
### Tags

#tag1 #tag2 #mytag

----

blah blah blah

Searched for it:

Screenshot_20240419_182732

The file will now be in the cache, though, so I stopped the server and started it again.

Updated the file to add #tagno3 and recrawled:

REINDEX=Y ./crawler/app/crawler.py

Now when I search by filename, the tag appears in the result

Screenshot_20240419_183207

That makes sense, because it's been loaded from the storage file rather than the index. What we need to see now is whether it's possible to search by that tag.

Screenshot_20240419_183320

It is.

I think I've an idea what happened in the earlier case.

The files hadn't previously been searched for/loaded and therefore will not have been in the filestore cache.

So, when searching for something that matched their filename, they'll have been loaded from storage including the details of that tag. That allowed the tag to display in results.

However, it wasn't possible to receive results when searching for the tag because INDEX_CHECK_INTERVAL had not elapsed and, therefore, the tag index had not yet been reloaded - I'm guessing my search must've been a little before the reload seen in the graph.
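
To illustrate the timing, here is a toy sketch of that theory. It is not the portal's actual code: the Portal class, its method names, the in-memory "storage" dict and the 300-second interval are all made up for illustration. Only the name INDEX_CHECK_INTERVAL comes from the real setup. File records are loaded from storage on first use (so they carry the new tag), whilst the tag index is only rebuilt once the interval has elapsed.

```python
import time

# Hypothetical value, in seconds; the real setting may differ.
INDEX_CHECK_INTERVAL = 300


class Portal:
    """Toy model: file records load on demand, the tag index reloads on a timer."""

    def __init__(self, storage, clock=time.monotonic):
        self.storage = storage      # filename -> set of tags (stands in for storage files)
        self.clock = clock          # injectable so the example can fake the passage of time
        self.file_cache = {}        # filename -> tags, filled lazily on first lookup
        self.tag_index = {}         # tag -> set of filenames, rebuilt only on reload
        self.last_index_load = None
        self._reload_tag_index()

    def _reload_tag_index(self):
        index = {}
        for name, tags in self.storage.items():
            for tag in tags:
                index.setdefault(tag, set()).add(name)
        self.tag_index = index
        self.last_index_load = self.clock()

    def _maybe_reload(self):
        # The index is only refreshed once the check interval has elapsed.
        if self.clock() - self.last_index_load >= INDEX_CHECK_INTERVAL:
            self._reload_tag_index()

    def search_filename(self, term):
        self._maybe_reload()
        results = {}
        for name in self.storage:
            if term in name:
                if name not in self.file_cache:
                    # Cache miss: read the record fresh from storage, new tags included.
                    self.file_cache[name] = set(self.storage[name])
                results[name] = self.file_cache[name]
        return results

    def search_tag(self, tag):
        self._maybe_reload()
        return sorted(self.tag_index.get(tag, set()))


# Walk through the suspected sequence of events with a fake clock.
now = [0.0]
storage = {"test_file.md": {"tag1", "tag2", "mytag"}}
portal = Portal(storage, clock=lambda: now[0])

storage["test_file.md"].add("tagno3")          # recrawl writes the new tag to storage

now[0] = 10.0                                   # well inside the check interval
by_name = portal.search_filename("test_file")   # record loaded from storage
before = portal.search_tag("tagno3")            # tag index is still the old copy

now[0] = float(INDEX_CHECK_INTERVAL)            # interval has now elapsed
after = portal.search_tag("tagno3")             # index reloaded, tag is searchable
```

In this model the filename search shows the new tag straight away, the tag search returns nothing until the interval passes, and a restart (constructing a new Portal) would also pick the tag up immediately, which lines up with what I saw.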

I'm going to close this as invalid - if there's any further sign of the problem, though, we can reopen.