Currently aggregates work by building a list of values which can then be operated on. There's, obviously, a memory cost to this.
Whilst this processing is unavoidable for aggregates like `mean`, there are circumstances where it may be wasted effort.
Neither `first` nor `last` needs a list of all values (in fact, `first` doesn't even need to iterate over all of them).
If they're used in conjunction with other aggregates then we'll still need to build a list of values, but if they're used alone it'd be good if we could avoid that processing.
`last` can be tracked quite simply:

    for foo in bar:
        last_val = foo['_value']
`first` is even easier:

    for foo in bar:
        if num_aggregates == 1:
            first = foo['_value']
            break
The difficulty comes in detecting it - we'll probably want to populate some kind of state when loading config.
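One way the detection could work, as a rough sketch: count the enabled aggregates once when the config is loaded, stash the result, and let the processing loop consult that state. The config layout (an `aggregates` dict mapping name to an enabled flag), the `vars` key used to hold derived state, and the function names `load_config` and `collect` are all assumptions for illustration, not the project's actual structure.

```python
def load_config(raw):
    """Return a copy of the config with derived state added (hypothetical layout)."""
    config = dict(raw)
    active = [name for name, enabled in raw.get("aggregates", {}).items() if enabled]
    # Derived values live under their own key so they don't mix with user settings
    config["vars"] = {
        "num_aggregates": len(active),
        "only_first": active == ["first"],
        "only_last": active == ["last"],
    }
    return config


def collect(records, config):
    """Collect values for one table, skipping the list build where possible."""
    derived = config["vars"]
    if derived["only_first"]:
        # Only the first record matters, so stop as soon as we have it
        for rec in records:
            return {"first": rec["_value"]}
        return {}
    if derived["only_last"]:
        # We still have to iterate, but there's no need to keep every value
        last_val = None
        seen = False
        for rec in records:
            last_val = rec["_value"]
            seen = True
        return {"last": last_val} if seen else {}
    # General case: other aggregates need the full list
    return {"values": [rec["_value"] for rec in records]}
```

Doing the check once at load time keeps the per-record loop free of any aggregate counting.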
Activity
06-Feb-23 09:01
assigned to @btasker
06-Feb-23 09:04
mentioned in commit abe22c46e937d25a2d0b1a13652694dab522edd7
Message
feat: Count aggregates when loading config
This could be considered prep for utilities/python_influxdb_downsample#12
It also introduces a new configuration attribute: `vars`
This provides an area that we can push auto-calculated variables into without polluting the outer config namespace.
It's not intended to be human-editable and so shouldn't appear in the config file itself
06-Feb-23 09:23
mentioned in commit 90fa038e8a97cdf5f977790610bea4da94c9df54
Message
Short-circuit iteration if first is the only active aggregator utilities/python_influxdb_downsample#12
If `first` is the only active aggregator we'll only process the first record in each table, knowing the rest will be discarded anyway
06-Feb-23 09:25
mentioned in commit 9e543c986ddbbf2626e74392edb14a2c7a50120b
Message
Add short-circuit if last is the only active aggregator utilities/python_influxdb_downsample#12
06-Feb-23 09:26
This was simpler than expected.
06-Feb-23 13:29
One thing I wondered was whether we want to allow for complex usage too.
For example, if `first` and `last` are used together, we can still make efficiency savings in the `last` processing (but not `first`).
But, doing so means adding quite a lot of complexity to the loop (because of the way we currently implement the short-circuit we'd need additional tracking of the first value for this use-case).
Having both `first` and `last` active (and without other aggregates) should be pretty rare, so it doesn't seem worth introducing overhead to cater to it.
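For reference, the extra tracking would look roughly like the following. This is a standalone sketch rather than the project's actual loop: the function name `first_and_last` is made up, and it assumes records are dicts with a `_value` key as in the snippets in the description.

```python
def first_and_last(records):
    """Track first and last values in one pass without building a list."""
    first_val = None
    last_val = None
    seen_first = False  # extra state that the plain `last` short-circuit doesn't need
    for rec in records:
        if not seen_first:
            # Capture the first value once, then never again
            first_val = rec["_value"]
            seen_first = True
        last_val = rec["_value"]
    return first_val, last_val
```

The `seen_first` check runs for every record, which is the kind of per-iteration overhead that doesn't seem worth paying for a rare configuration.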