I recently moved www.bentasker.co.uk over from using Joomla to a static site generator (websites/BEN#3).
When doing the move, I didn't implement any analytics hooks and have since decommed the Piwik/Matomo system I was using (jira-projects/CDN#8).
However, that's come at the cost of not having a good overview of where traffic is coming from (or even where it's going to - most of my sites are served via CDN and I deliberately do not receive full access logs).
What I'd like is an analytics system which can collect some of this information, without overcollecting.
So, we want to capture
Activity
15-Dec-21 16:27
assigned to @btasker
15-Dec-21 16:28
I do find it interesting to see where users are located, at least at the country level.
We can't geolocate them without calling external services (or submitting the IP as part of the payload/getting direct connections) though.
But we could use some JS to extract the timezone they've got set. That's probably granular enough to satisfy my curiosity, and it helps keep cardinality down.
15-Dec-21 17:09
My current thinking is:
That way writes can be batched off-net but still ultimately written to my local InfluxDB. Having the Lua in the middle helps guard against potentially malicious input, as well as allowing types to be enforced.
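To illustrate the kind of guarding and type enforcement meant here, a minimal sketch follows. It is written in JavaScript purely for illustration (the real intermediary is Lua), and the schema, field names and length limit are all hypothetical rather than taken from the actual implementation.

```javascript
// Hypothetical schema: field names, types and limits are illustrative only.
const SCHEMA = {
  domain: "string",
  timezone: "string",
  platform: "string",
  load_time_ms: "number",
};
const MAX_LEN = 128;

// Returns a cleaned object, or null if the input should be rejected.
function sanitise(raw) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch (e) {
    return null; // not valid JSON: reject outright
  }
  if (data === null || typeof data !== "object" || Array.isArray(data)) {
    return null; // only plain JSON objects are accepted
  }

  const clean = {};
  for (const [field, type] of Object.entries(SCHEMA)) {
    const value = data[field];
    if (typeof value !== type) return null; // missing or wrong type
    if (type === "string" && value.length > MAX_LEN) return null;
    if (type === "number" && !Number.isFinite(value)) return null;
    clean[field] = value; // fields not named in SCHEMA are never copied
  }
  return clean;
}
```

Only fields named in the schema survive, so anything unexpected in the payload never reaches the database write.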
15-Dec-21 17:29
mentioned in issue jira-projects/CDN#8
15-Dec-21 17:41
So, as it's simplest, we might have the JS post a JSON payload, something like
Looks like we can get the TZ with
And Platform can be pulled from
navigator.platform
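A rough sketch of how the agent might gather these two values is below. It assumes the timezone is read via the standard `Intl.DateTimeFormat().resolvedOptions().timeZone` call, which is one common way to get it. The function name and the payload field names are hypothetical, not the agent's real schema.

```javascript
// Hypothetical helper: the name and field names are illustrative only.
function collectClientInfo(nav) {
  // Use the supplied object, else the browser's global navigator if present.
  const n = nav || (typeof navigator !== "undefined" ? navigator : {});
  return {
    // IANA zone name such as "Europe/London", as configured on the client.
    timezone: Intl.DateTimeFormat().resolvedOptions().timeZone || "unknown",
    // Coarse platform string such as "Linux x86_64" or "Win32".
    platform: n.platform || "unknown",
  };
}

// Serialise to JSON, ready to be sent as the request body.
const payload = JSON.stringify(collectClientInfo());
```

Both values are coarse by design: neither needs the client's IP, and each has a fairly small set of possible values.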
15-Dec-21 19:39
mentioned in commit b2a3bdc00d7b9af8adc7b94cc5e22943db6efeda
Message
We have a working PoC for websites/privacy-sensitive-analytics#1
This implements
An agent which collects
The agent submits via a simple JSON payload, received via an OpenResty server block and processed by Lua.
Lua processes the JSON and reformats it into InfluxDB line protocol for writing into either Telegraf or InfluxDB.
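For reference, the reformatting step looks roughly like the sketch below. It is JavaScript rather than the Lua actually used, and the measurement, tag and field names in the usage example are hypothetical. It assumes values have already been validated and contain no newlines.

```javascript
// Escape characters that are special in tag keys, tag values and field keys.
function escapeKey(s) {
  return String(s).replace(/([,= ])/g, "\\$1");
}

// Integers get the "i" suffix, other numbers are written as floats,
// and anything else becomes a quoted string field.
function formatField(v) {
  if (typeof v === "number") return Number.isInteger(v) ? `${v}i` : String(v);
  return `"${String(v).replace(/(["\\])/g, "\\$1")}"`;
}

// timestampNs is a BigInt, since nanosecond epochs exceed Number's safe range.
function toLineProtocol(measurement, tags, fields, timestampNs) {
  const name = String(measurement).replace(/([, ])/g, "\\$1");
  const tagPart = Object.keys(tags)
    .sort() // sorted tag keys are recommended for write performance
    .map((k) => `,${escapeKey(k)}=${escapeKey(tags[k])}`)
    .join("");
  const fieldPart = Object.entries(fields)
    .map(([k, v]) => `${escapeKey(k)}=${formatField(v)}`)
    .join(",");
  return `${name}${tagPart} ${fieldPart} ${timestampNs}`;
}

const line = toLineProtocol(
  "pageviews", // hypothetical measurement name
  { timezone: "Europe/London", platform: "Linux x86_64" },
  { count: 1 },
  1639589340000000000n
);
// line is:
// pageviews,platform=Linux\ x86_64,timezone=Europe/London count=1i 1639589340000000000
```

Writing low-cardinality values such as timezone and platform as tags keeps them indexed and cheap to group by.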
29-Dec-21 11:04
Closing this as done - v0.1 has been released