Currently, we don't load the analytics agent on error pages.
That means, though, that we currently have no visibility of where there might be broken links (internal or external).
So, we should add a means to record (and report on) 404s (in particular).
Activity
29-Mar-22 07:41
At its simplest, we probably want to use the standard collection mechanism and just use a different state than normal (i.e. not PageView).
It would mean some of the graphing needs adjusting: some of the queries simply exclude video play events, so we'd need to swap those over to being inclusive rather than exclusive.
29-Mar-22 07:43
Equally, there's an argument for introducing a status_code tag. Whilst we're only really focused on 404s in this ticket, it'd allow easy future expansion into collecting for other status codes.
We'd still want to review graphing queries, but that's likely to be true either way (and we probably do want some of the graphs to include 404s).
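As a rough illustration of what a write carrying a status_code tag might look like, here is a minimal sketch that assumes an InfluxDB-style line protocol. The measurement name (pageviews), the count field and the build_point helper are all hypothetical and are not taken from the agent or the server-side code.

```python
def escape_tag(value: str) -> str:
    """Escape the characters that are special inside a line-protocol tag."""
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


def build_point(measurement: str, tags: dict, fields: dict, timestamp_ns: int) -> str:
    """Build one line-protocol point (hypothetical helper, integer fields only)."""
    tag_str = ",".join(
        f"{escape_tag(k)}={escape_tag(v)}" for k, v in sorted(tags.items())
    )
    # The trailing "i" marks the field value as an integer.
    field_str = ",".join(f"{k}={v}i" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {timestamp_ns}"


# Assumed names: "pageviews" and "count" are placeholders for illustration.
line = build_point(
    "pageviews",
    {"page": "/missing page", "status_code": "404"},
    {"count": 1},
    1648539780000000000,
)
print(line)
```

Because status_code is a tag rather than a field, it can be filtered and grouped on cheaply, which is what would make later expansion to other status codes straightforward.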
29-Mar-22 07:46
The bit that does concern me, though, is the cardinality implications.
page is a tag, so its values are part of the series key. With that only containing valid page paths there's a finite (even if high) limit to the cardinality it can cause. If we include page for 404s then we're essentially opening that up.
Conversely, there really isn't an awful lot of point in collecting 404s if we don't record the path they were for, as that prevents us from investigating and fixing links, implementing redirects etc.
What we might want to do, then, is record 404s under a different measurement. The downside of that is adding more complexity to the server-side Lua.
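The cardinality concern can be made concrete with a small sketch. It assumes that a series is identified by its measurement plus its full tag set; the measurement names (pageviews, pageviews_404) and the paths are invented for illustration.

```python
from collections import defaultdict


def series_counts(points):
    """Count distinct series (unique tag sets) per measurement."""
    series = defaultdict(set)
    for measurement, tags in points:
        series[measurement].add(tuple(sorted(tags.items())))
    return {m: len(s) for m, s in series.items()}


# Valid pages form a finite set.
valid_pages = ["/", "/about", "/contact"]
# Paths that 404 can be anything a broken link or scanner requests.
missing_paths = [f"/old-post-{i}" for i in range(1000)]

# Option 1: 404s mixed into the normal measurement.
mixed = [("pageviews", {"page": p}) for p in valid_pages]
mixed += [("pageviews", {"page": p, "status_code": "404"}) for p in missing_paths]

# Option 2: 404s written to their own (hypothetical) measurement.
split = [("pageviews", {"page": p}) for p in valid_pages]
split += [("pageviews_404", {"page": p}) for p in missing_paths]

mixed_counts = series_counts(mixed)
split_counts = series_counts(split)
print(mixed_counts)
print(split_counts)
```

The total number of series is the same in both cases. What the split changes is where the unbounded growth lands: the normal measurement stays limited to valid page paths.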
29-Mar-22 12:15
There are advantages on either side.
If we mix the 404s in, then it's easier to run off a graph showing response statuses.
But, if we mix them in, we then have to update a bunch of graphs and add complexity to the downsampling script.
I think it'd be better to write into a separate measurement. Beyond counts, I can't see it being information that we'd want to keep long term, so I don't really want it mixed in with the standard downsample.
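To show what keeping only counts long term could mean in practice, here is a minimal sketch that reduces raw 404 records to a count per time window and discards the path. The 15-minute window, the record shape and the downsample_counts name are assumptions, not details of the actual downsampling script.

```python
from collections import Counter
from datetime import datetime, timezone


def downsample_counts(events, window_seconds=900):
    """Reduce (timestamp, path) records to a count per fixed window."""
    counts = Counter()
    for ts, _path in events:
        epoch = int(ts.timestamp())
        # Align each event to the start of its window.
        bucket = epoch - (epoch % window_seconds)
        counts[datetime.fromtimestamp(bucket, tz=timezone.utc)] += 1
    return dict(sorted(counts.items()))


# Invented sample records: (time of the 404, requested path).
events = [
    (datetime(2022, 3, 29, 12, 1, tzinfo=timezone.utc), "/old-post"),
    (datetime(2022, 3, 29, 12, 14, tzinfo=timezone.utc), "/typo"),
    (datetime(2022, 3, 29, 12, 16, tzinfo=timezone.utc), "/old-post"),
]
result = downsample_counts(events)
for window_start, count in result.items():
    print(window_start.isoformat(), count)
```

The downsampled output no longer carries the path, so it adds one series regardless of how many distinct paths produced 404s.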
29-Mar-22 12:29
mentioned in commit 5f3b3dcb01fab03339407c5e30ff0a4f11b39973
Message
Add server side support for logging 404s for websites/privacy-sensitive-analytics#11
29-Mar-22 12:36
mentioned in commit ac1620d25415e7a3a293f1f6ecb367cbf5acdf7f
Message
Add agent support for reporting 404s for websites/privacy-sensitive-analytics#11
Error pages should set
After the agent script has been embedded
29-Mar-22 13:14
This can be enabled by including the following in the <head> of error pages
29-Mar-22 13:20
A count over time can be extracted with
29-Mar-22 13:23
The last 50 404's can be listed with
29-Mar-22 15:28
mentioned in commit 8f08e25d19ff8c0c70b60cefbdf37c78ac541e3c
Message
Downsample 404 stats for websites/privacy-sensitive-analytics#11
29-Mar-22 16:17
404 counts are now included in downsampling and reflected in the historic dashboard