DEPRECATED: Support for this was disabled in websites/privacy-sensitive-analytics#21
Background
websites/privacy-sensitive-analytics#18 implemented a pixel-based endpoint so that hits can be counted without collecting information about the user's browser.
Usage
The calling page should embed the image endpoint:
<img src="https://[url]/count.gif">
Assuming the Referer header is available, the system then records the domain and page that the image was embedded in.
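As a rough illustration of how a Referer value can be reduced to a domain and a page, here is a minimal Python sketch. It is not the endpoint's actual code: the function name referer_to_domain_page is hypothetical, and discarding the query string and fragment is an assumption rather than documented behaviour.

```python
from urllib.parse import urlsplit


def referer_to_domain_page(referer):
    """Split a Referer header value into (domain, page).

    Returns None when the header is missing or has no usable host.
    Hypothetical helper: the real endpoint may normalise differently.
    """
    if not referer:
        return None
    parts = urlsplit(referer)
    if not parts.hostname:
        return None
    # Assumption: only the host and path are kept; query string and
    # fragment are discarded.
    return parts.hostname, parts.path or "/"


print(referer_to_domain_page("https://www.example.com/posts/hello.html?utm=1"))
# ('www.example.com', '/posts/hello.html')
```

If the browser omits the header (for example because of a strict Referrer-Policy), the helper returns None and no domain or page can be attributed to the hit.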
Cardinality
Because the system records so little information about each request, simultaneous requests are likely to overwrite one another: two hits for the same page with the same timestamp are indistinguishable, so they are stored as a single point.
To counter this, the system generates a unique request ID from Nginx's internal variables:
ngx.var.connection, -- Nginx connection ID
ngx.var.connection_requests, -- how many requests have used this connection
ngx.var.pid -- pid of Nginx
This yields an ID of the form:
3316936-1-28791
Whilst this prevents points from overwriting one another, it also results in extremely high cardinality within the database.
The identifier should therefore be stripped when downsampling into aggregates.
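The trade-off described above can be sketched in Python with a dictionary standing in for the database. This is an illustration under assumptions, not the system's code: the tag names domain and page, the field name hits, and the idea that the ID is stored in a tag called sess (the name is borrowed from the column dropped in the Flux task) are all guesses, as is joining the three Nginx values with hyphens in the order listed.

```python
def request_id(connection, connection_requests, pid):
    """Join the three Nginx values into one ID (assumed order and separator)."""
    return f"{connection}-{connection_requests}-{pid}"


def write_point(store, measurement, tags, timestamp, fields):
    """Mimic a time-series write: a point is identified by its measurement,
    tag set and timestamp, so writing the same key again overwrites fields."""
    key = (measurement, tuple(sorted(tags.items())), timestamp)
    store.setdefault(key, {}).update(fields)


ts = 1700000000  # two hits arriving with the same timestamp
base_tags = {"domain": "example.com", "page": "/"}  # hypothetical tag names

# Without a per-request ID both hits share one key, so one hit is lost.
without_id = {}
for _ in range(2):
    write_point(without_id, "pf_analytics_test_pixel", base_tags, ts, {"hits": 1})

# With the ID stored as a tag, each hit becomes its own series.
with_id = {}
for connection in (3316936, 3316937):
    tags = dict(base_tags, sess=request_id(connection, 1, 28791))
    write_point(with_id, "pf_analytics_test_pixel", tags, ts, {"hits": 1})

print(request_id(3316936, 1, 28791))  # 3316936-1-28791
print(len(without_id), len(with_id))  # 1 2
```

Every distinct ID creates a new series, which is why the identifier protects individual hits but inflates cardinality until it is dropped during downsampling.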
Downsampling
The collected metrics can be downsampled with a simple Flux task:
option task = {
    name: "downsample_hitcounter",
    every: 15m,
    offset: 1m,
    concurrency: 1,
}

// Destination bucket and connection details
out_bucket = "websites/analytics"
host = "http://192.168.3.84:8086"
token = ""

sourcedata = from(bucket: "telegraf/autogen", host: host, token: token)
    |> range(start: -task.every)
    |> filter(fn: (r) => r._measurement == "pf_analytics_test_pixel")
    // Drop the high-cardinality sess column before aggregating
    |> drop(columns: ["sess"])
    // Sum the hits in each 15 minute window
    |> aggregateWindow(every: 15m, fn: sum)
    // Rename the field and measurement for the downsampled series
    |> map(fn: (r) => ({ r with
        _field: "hitcount",
        _measurement: "pf_analytics_pixel"
    }))
    |> drop(columns: ["_start", "_stop", "type"])
    |> to(bucket: out_bucket, host: host, token: token)
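
To make the effect of the task easier to follow, here is a minimal Python sketch of the same idea: remove the per-request tag, then sum the hits per remaining tag set in 15 minute windows. It is a simplified model rather than a port of the Flux task. The point layout (time, tags, value) and the tag names domain and page are assumptions, and each total is stamped with the end of its window, which matches the default behaviour of Flux's aggregateWindow.

```python
from collections import defaultdict

WINDOW = 15 * 60  # seconds, mirrors every: 15m


def downsample(points, window=WINDOW):
    """Sum values per (tag set without sess, window end)."""
    totals = defaultdict(int)
    for p in points:
        # Strip the per-request identifier so hits merge into one series.
        tags = {k: v for k, v in p["tags"].items() if k != "sess"}
        # Windows are [start, stop); the total is stamped with the stop time.
        window_stop = (p["time"] // window + 1) * window
        totals[(tuple(sorted(tags.items())), window_stop)] += p["value"]
    return dict(totals)


# Hypothetical raw points: four hits on one page, each with its own sess ID.
raw = [
    {"time": 0, "tags": {"domain": "example.com", "page": "/", "sess": "1-1-10"}, "value": 1},
    {"time": 100, "tags": {"domain": "example.com", "page": "/", "sess": "2-1-10"}, "value": 1},
    {"time": 899, "tags": {"domain": "example.com", "page": "/", "sess": "3-1-10"}, "value": 1},
    {"time": 900, "tags": {"domain": "example.com", "page": "/", "sess": "4-1-10"}, "value": 1},
]

for (tags, stop), hitcount in sorted(downsample(raw).items()):
    print(dict(tags), stop, hitcount)
# {'domain': 'example.com', 'page': '/'} 900 3
# {'domain': 'example.com', 'page': '/'} 1800 1
```

Four single-hit series collapse into one series with two points, which is the cardinality reduction the downsampling task is meant to achieve.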