Currently receipt files don't get regenerated/updated, so the "Last Seen" value is only accurate as of the time the alert fired.
It is possible to manually force a regeneration (by setting DRY_RUN to Y and removing the state files), but
It's the final point that bothers me most: we don't want to incorrectly set flags because something changed in config.
This ticket is to record some of that state so that it can be used when a regeneration is run.
#1 | Track Receipt State |
Activity
20-Jan-23 22:38
assigned to @btasker
29-Jan-23 11:52
The best place to start is probably to figure out what we want to record for future reference.
I think, in practice, there are two groups - "track" and "update on change" - where the latter is tracked, but also recalculated with any new items added to the tracked changes.
Update on Change
rDNS
: might be useful/interesting to check if it changes over time. We want it to be current, but also don't want to lose useful information if a bot author realises a mistake and removes the PTR record
ASN
: Useful to track history - if it changes, it might act as a prompt to check whether the bot is actually still active
Tor Exit node
: Should track history
Last Seen
: this should update automatically, but we should also track state - if we stop seeing requests (and expire old logs out), it'd be helpful to have a record of when it was last seen
Average number of daily requests
Observed useragents
Observed Paths
Flags
Track
First Seen
: Want to make sure the earliest date remains the same

Realistically, that's most points. So, it probably makes as much sense to dump the entire receipt object as a state-file, and then worry about the logic above during regeneration.
29-Jan-23 11:53
mentioned in issue misc/python-mastodon-snitch-bot#1
29-Jan-23 12:04
marked this issue as related to misc/python-mastodon-snitch-bot#1
29-Jan-23 12:08
The bot will now keep track of state.
misc/python-mastodon-snitch-bot#2 will track implementation of receipt refreshing/regeneration.
Will also need to come up with a solution for existing receipt files - there are few enough that it might just be a case of populating their state by hand, but we'll cross that bridge when we come to it.
29-Jan-23 15:15
Regeneration is now largely implemented.
ASN history will not currently be updated in the receipts file as it would mean special changes to handle the ipinfo link. Although AS changes are a potential signal, they're not one that most admins are likely to care about, so it seemed safe to defer this.

Regeneration will however record any AS changes in the internal state, so if we later want to reflect these changes in the receipt file we'll be able to show changes between now and then - the information isn't lost, it just isn't displayed in the receipt file.
Tracking of all other items is implemented.
29-Jan-23 15:20
The next challenge, though, lies in populating state for all existing bots.
I had hoped that running the bot across a large time period would do the job. Unfortunately, that longer period allows other IPs to cross the reporting threshold, so we've gained about 50% more files.
So, we need to separate out those that have already crossed the threshold, so they can be updated and checked (the first generation date will be wrong, but can be pulled from the original receipts). The additional ones should be checked to see whether they point towards any necessary rule tweaks.
Took a listing of state off the bot's host
Copied to my test dir
Need to work through them now
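Separating the two groups could be scripted along the lines below. This is only a sketch: it assumes a state file and its receipt share a filename stem, which may not match the bot's real naming, and `partition_state` is a hypothetical helper rather than something the bot provides.

```python
from pathlib import Path


def partition_state(state_dir, receipt_dir):
    """Split state files into those with an existing receipt and those without."""
    # Assumption: state file and receipt share a filename stem.
    receipts = {p.stem for p in Path(receipt_dir).iterdir() if p.is_file()}

    existing, additional = [], []
    for state_file in sorted(Path(state_dir).iterdir()):
        if not state_file.is_file():
            continue
        if state_file.stem in receipts:
            existing.append(state_file)    # already reported: update and check
        else:
            additional.append(state_file)  # newly over threshold: review rules
    return existing, additional
```

The first list is the set needing its first generation date restored from the original receipts; the second is the set to review for possible rule tweaks.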
29-Jan-23 15:47
Files have been manually corrected and deployed.
I think, given the manual work involved, it's best to cut the release now - the longer it's delayed, the more likely new bots will appear and need state manually generating.
29-Jan-23 15:51
A Docker image has been built for v0.13 and deployed onto the host.

A wrapper has been created for it, and scheduled in cron so that regenerations happen every 12 hours (at 10 past the hour)
I've manually triggered it and run a diff of the receipt files to check there's nothing crazy happening.
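For repeat checks, the same comparison could be done with Python's `difflib`. The sketch below assumes a copy of the receipts directory was taken before the regeneration ran; `diff_receipts` and the directory layout are hypothetical, not part of the bot.

```python
import difflib
from pathlib import Path


def _read_lines(path):
    """Return a file's lines, or an empty list if the file does not exist."""
    return path.read_text().splitlines(keepends=True) if path.exists() else []


def diff_receipts(before_dir, after_dir):
    """Return {filename: unified diff} for every receipt that differs."""
    before_dir, after_dir = Path(before_dir), Path(after_dir)
    names = {p.name for p in before_dir.iterdir() if p.is_file()}
    names |= {p.name for p in after_dir.iterdir() if p.is_file()}

    changes = {}
    for name in sorted(names):
        diff = list(difflib.unified_diff(
            _read_lines(before_dir / name),
            _read_lines(after_dir / name),
            fromfile=f"before/{name}",
            tofile=f"after/{name}",
        ))
        if diff:  # identical files produce no diff lines
            changes[name] = "".join(diff)
    return changes
```

Printing each value in the returned mapping gives a quick view of what a regeneration actually changed, including receipts that only exist on one side.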
29-Jan-23 15:51
changed title from Save State {-for Receipt Regeneration-} to Save State {+and regenerate Receipts+}
29-Jan-23 15:51
changed the description