Wiki: Subnet Matches/project-management-only / Scraper Snitch Bot



Subnet matches were introduced in project-management-only/scraper-snitch-bot#9 to reduce the noise caused by Meta's crawlers.

Because of the size of the IPv6 address space, Meta's crawlers can connect from a wide range of IPs, and each new IP ultimately triggers its own notification toot. That's not necessarily Meta playing fast and loose so much as a reflection of how IPv6 addresses tend to be allocated.

The subnet matching functionality is something of a quick hack, but should serve to reduce this noise.

Within the config, I define known subnets:

grouped_prefixes:
  - 2a03:2880::/32
  - 2620:0:1c00::/40
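
As a rough illustration of what matching against these prefixes involves, the sketch below uses Python's standard `ipaddress` module. The names `GROUPED_PREFIXES` and `find_grouped_prefix` are hypothetical and are not taken from the bot's code; the bot's real implementation may differ.

```python
import ipaddress

# Hypothetical constant: the prefixes from the grouped_prefixes config above
GROUPED_PREFIXES = ["2a03:2880::/32", "2620:0:1c00::/40"]


def find_grouped_prefix(ip, prefixes=GROUPED_PREFIXES):
    """Return the first configured prefix that contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for prefix in prefixes:
        net = ipaddress.ip_network(prefix)
        # Only compare addresses and networks of the same IP version
        if addr.version == net.version and addr in net:
            return str(net)
    return None
```

An address such as `2a03:2880:f800:1::1` would be reported as belonging to `2a03:2880::/32`, while an address outside both ranges yields `None` and would be handled as an individual IP.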

If a misbehaving IP falls within one of these ranges, a few things happen:

There will be no re-toot if a bot from another IP within that subnet comes along and misbehaves - the assumption is that users will have blocked the subnet as a result of the earlier alert.

The toot text will also change to refer to a subnet rather than an IP.
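
The two behaviours above can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: `alert_key`, `build_toot`, the `already_alerted` set and the toot wording are all hypothetical, and suppressing repeat alerts for an individual IP is an assumption made to keep the example simple.

```python
import ipaddress


def alert_key(ip, prefixes):
    """Return (key, is_subnet): the matching prefix, else the IP itself."""
    addr = ipaddress.ip_address(ip)
    for prefix in prefixes:
        net = ipaddress.ip_network(prefix)
        if addr.version == net.version and addr in net:
            return str(net), True
    return str(addr), False


def build_toot(ip, prefixes, already_alerted):
    """Return toot text for ip, or None if its key was already alerted on."""
    key, is_subnet = alert_key(ip, prefixes)
    if key in already_alerted:
        # A second IP in an alerted subnet lands here, so there is no re-toot.
        # Assumption: a repeat of the same individual IP is suppressed too.
        return None
    already_alerted.add(key)
    # Hypothetical wording: refer to the subnet rather than the IP on a match
    label = "subnet" if is_subnet else "IP"
    return f"Misbehaving bot seen from {label} {key}"
```

Keying the "already alerted" state on the prefix rather than the address is what collapses many crawler IPs into a single notification.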


Shortcomings

There are a few known shortcomings with this approach:

However, these are probably outweighed by the benefit of reduced alert fatigue.