As part of the work looking into legal basis (project-management-only/scraper-snitch-bot#2), it's been identified that there's an additional control that can be put in place to help mitigate the impact of mistakes.
Toots and receipt publication should only happen during UK daytime.
This is to ensure that, if the bot makes a mistake, it doesn't go out at 0100 and remain unfixable until I wake up hours later.
Activity
20-Jan-23 08:43
assigned to @btasker
20-Jan-23 08:47
This is primarily a case of adjusting crontab, but we also need to make sure that the time period passed into the main SQL query accounts for the gap in reporting.
What we don't want is to have something like this:
2200: Query last 4 hours
2300: Query last 4 hours
0800: Query last 4 hours
0900: Query last 4 hours
because there'll be a significant window of time that isn't accounted for. The time period used in the query needs to be at least as long as the gap in runtimes. If the query period is being adjusted, then the minimum number of requests may also need adjusting.
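The constraint above can be checked mechanically. The following is a minimal sketch, not the bot's actual code: `max_gap_hours` and `window_covers_gap` are hypothetical helpers, and the hourly 0800 to 2300 schedule is an assumption made purely for illustration.

```python
def max_gap_hours(run_hours):
    """Return the largest gap (in hours) between consecutive daily runs.

    run_hours is an iterable of hours-of-day (0-23) at which cron fires.
    The gap wraps around midnight, so 2300 -> 0800 counts as 9 hours.
    """
    hours = sorted(set(run_hours))
    if not hours:
        raise ValueError("at least one run hour is required")
    # Pair each run with the next one, wrapping the last run to the first.
    following = hours[1:] + hours[:1]
    # A single daily run is 24 hours from itself, hence the "or 24".
    return max((nxt - cur) % 24 or 24 for cur, nxt in zip(hours, following))


def window_covers_gap(run_hours, window_hours):
    """True if the query window is at least as long as the longest gap."""
    return window_hours >= max_gap_hours(run_hours)


# Hypothetical schedule: hourly runs between 0800 and 2300 inclusive.
daytime_runs = list(range(8, 24))

print(max_gap_hours(daytime_runs))          # longest overnight gap
print(window_covers_gap(daytime_runs, 4))   # 4 hour window
print(window_covers_gap(daytime_runs, 12))  # 12 hour window
```

With that assumed schedule the overnight gap is 9 hours, so a 4 hour window leaves activity unreported whilst a 12 hour window does not.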
The other, more complex, alternative is to move to a queue-based model: rather than tooting/publishing, the analysis bot would write details into a queue for a third process to collect. That third process would be restricted to daytime hours, whilst the analysis bot would just carry on about its business.
That does rather feel like over-engineering though.
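For reference, the queue-based alternative could look roughly like the sketch below. Everything in it is an assumption for illustration: the function names, the 0800 to 2200 daytime window, and the in-memory `deque` (separate processes would need a persistent store such as a database table or spool directory). `now` is assumed to already be UK local time.

```python
from collections import deque
from datetime import datetime, time

# Assumed daytime window (UK local time); the real bounds are not specified.
DAY_START = time(8, 0)
DAY_END = time(22, 0)


def enqueue_finding(queue, finding):
    """Analysis side: record a finding instead of publishing it directly."""
    queue.append(finding)


def is_daytime(now):
    """True if the (UK local) time falls within the publishing window."""
    return DAY_START <= now.time() < DAY_END


def publish_pending(queue, now, publish):
    """Publisher side: drain the queue, but only during daytime hours.

    Returns the number of findings published.
    """
    if not is_daytime(now):
        return 0
    count = 0
    while queue:
        publish(queue.popleft())
        count += 1
    return count


pending = deque()
enqueue_finding(pending, "example finding")

# At 0100 nothing is published; the finding waits in the queue.
print(publish_pending(pending, datetime(2023, 1, 20, 1, 0), print))
# At 0900 the queued finding is released.
print(publish_pending(pending, datetime(2023, 1, 20, 9, 0), print))
```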
20-Jan-23 08:47
mentioned in issue #2
20-Jan-23 19:59
Looking at it, I think we can safely stretch the query period out as far as about 12 hours without encountering issues.
I'd like to do a few dry-run tests before actually doing that, though.
22-Jan-23 14:06
Have run some test queries using the wider threshold and there don't seem to be any negative ramifications - there aren't any bots which suddenly slip from having too few requests to having enough.
22-Jan-23 14:09
Cron has been updated:
The query period has been updated to 12 hour in the wrapper script.