RFC: `approx_count_distinct` state table encoding #3414

jon-chuang · 2022-06-23T03:34:24Z

RFC Template

Background

We need to consider approx_count_distinct state table encoding to facilitate:

Recovery from checkpoint upon node failure
Recovery from checkpoint upon config change actor placement

Design

Encoding of `approx_count_distinct`

key: table_id | bucket_id (16 bits) | count_index (8 bits) (count_index ranges from 0-48 and represents trailing zeroes)
value: count (u64)

bucket_id (16 bits) | count_index (8 bits) forms the PK for the item.

Seems like vnode does not factor into the encoding since we are assuming that approx_count_distinct is a simple agg singleton operator.

Future Optimizations

We could consider approx_count_distinct with group_by. Because of the large default in-memory state size of approx_count_distinct, we may only be able to support a small number of group_by keys.

Discussions

Q&A

The text was updated successfully, but these errors were encountered:

fuyufjh · 2022-06-23T08:16:05Z

An alternative design is to store all counts of a bucket in one KV-entry, the size would be 92 bytes and seems okay to me.

fuyufjh · 2022-06-23T08:18:39Z

I am also thinking about whether my design is good, because the memory usage it quite high (~1.5MB). This would be a problem for both 2-phase agg and HashAgg. Perhaps we should only provides an append-only approx_count_distinct? 🤔

jon-chuang · 2022-06-23T09:59:54Z

Hmm, indeed.

One way is to use the bucket_id as distribution key to calculate vnode for 2-phase agg. So the state size is split n-way amongst n actors.

The partial agg can compute partial results of the harmonic mean.

As for hash agg, we may not need to support it. Or, we could throw a runtime error if the number of group by keys grows too large?

fuyufjh · 2022-06-24T07:20:06Z

One way is to use the bucket_id as distribution key to calculate vnode for 2-phase agg. So the state size is split n-way amongst n actors.

That's a very interesting insight! But it seems hard to be used along with other agg functions.

fuyufjh · 2022-06-24T07:24:04Z

As for hash agg, we may not need to support it. Or, we could throw a runtime error if the number of group by keys grows too large?

Yes, this is a feasible solution. I am just not sure whether this is what our users want. Anyway, the approx_count_distinct is expected to be a lightweight (but approximate) alternative for count(distinct), but it seems not lightweight and even cause more limitations.

fuyufjh · 2022-06-24T07:24:50Z

We may hold this RFC for a while, I think

jon-chuang · 2022-06-28T10:02:55Z

but it seems not lightweight and even cause more limitations.

Yes, indeed. It is not lightweight when the distinct elements are small in number and group keys are large in number. I think we may need the sparse-dense transition to handle this.

fuyufjh · 2022-09-28T10:14:44Z

I have added back this issue to the project board, to remind that we have to need to make a decision on whether we decide to support or disable approx_count_distinct, especially the non-append-only one.

fuyufjh · 2022-09-28T10:16:08Z

The difficulty is: the state, as described in the RFC, is very special compared with these aggregate functions that we have supported (e.g. min/max or sum/count)

jon-chuang · 2022-10-17T04:54:10Z

Note that one has to persist a different state per vnode to enable consistent scaling.

xxchan · 2023-03-29T15:12:46Z

What's the status of this RFC?

stdrc · 2025-04-24T09:40:05Z

What's the status of this RFC?

Current status of approx_count_distinct is:

It can work on append-only stream, for both hash agg and simple agg. The state is encoded into a condensed int array as one datum in the "intermediate state table". The encoding/decoding process is not that efficient. I haven't compare the performance with normal count(distinct x) though.
It has an implementation for non-append-only stream, can count correctly if no recovery is involved. By recovery, I mean, actor rebuilding for simple agg, and, group cache eviction or actor rebuilding for hash agg. It encode its state by simply store the result value in the "intermediate state table", and when recovering, it simply use that result as "initial count" which is summed with the HLL result for new inputs.

As said in previous discussion, we may consider not to allow this function in hash agg, because the state can explode ridiculously. When there's no many rows within each group, with a correctly implemented non-append-only state, there're fixed 65536 rows for each group, and the value column is of type int64[]. At the same time, a normal count(distinct x) requires only M rows in the state where M is the number of distinct values of x, with value column to be a simple int64 as the duplicate count of each distinct value. It seems not reasonable to use approx_count_distinct anyway when doing a grouped aggregation.

If we do decide to only support this agg for simple agg, we can fearlessly have a state table storing each bucket as one row, then we can have a correct non-append-only implementation.

jon-chuang added the type/feature Type: New feature. label Jun 23, 2022

neverchanje added the no-issue-activity label Aug 28, 2022

chenzl25 mentioned this issue Sep 28, 2022

Bug(stream): StreamingApproxCountDistinct should persistent buckets #5616

Closed

fuyufjh added this to the release-0.1.14 milestone Sep 28, 2022

st1page mentioned this issue Oct 17, 2022

Tracking(approx_count_distinct): append-only approx_count_distinct #5871

Closed

This was referenced Oct 23, 2022

feat(approx count distinct): append-only approx count distinct agg and its state #5976

Merged

feat(approx distinct): add in-memory append-only approx_count_distinct #5958

Merged

fuyufjh removed this from the release-0.1.14 milestone Oct 25, 2022

stdrc self-assigned this Apr 24, 2025

stdrc added the A-agg Area: Aggregate. label May 26, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

RFC: `approx_count_distinct` state table encoding #3414

RFC: `approx_count_distinct` state table encoding #3414

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

RFC: approx_count_distinct state table encoding #3414

RFC: approx_count_distinct state table encoding #3414

Comments

Uh oh!

RFC Template

Background

Design

Encoding of approx_count_distinct

Future Optimizations

Discussions

Q&A

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

RFC: `approx_count_distinct` state table encoding #3414

RFC: `approx_count_distinct` state table encoding #3414

Encoding of `approx_count_distinct`