Description
I need to run calculations over a single day's data. The daily volume is large, and the aggregate has to be refreshed in real time every 15 minutes, which causes a problem. If I bucket the data with time_bucket(interval '1 day', data_date), the whole day lands in a single bucket, so every refresh re-aggregates all of that day's data, which is inefficient.
If I instead bucket with time_bucket(interval '15 min', data_date), each day is split into 96 buckets. Individual buckets can then be refreshed, but producing the daily result still means re-aggregating the data across all 96 buckets.
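To make the two options concrete, here is a minimal sketch written as TimescaleDB continuous aggregates. The table name measurements, the value column, the sum() aggregate, and both view names are hypothetical placeholders for illustration only. The sketch also assumes measurements is already a hypertable on data_date.

```sql
-- Option 1: one bucket per day (all names here are hypothetical).
-- Every refresh of the current day has to recompute the whole bucket.
CREATE MATERIALIZED VIEW daily_totals
WITH (timescaledb.continuous) AS
SELECT time_bucket(INTERVAL '1 day', data_date) AS bucket,
       sum(value) AS total
FROM measurements
GROUP BY bucket
WITH NO DATA;

-- Option 2: 96 buckets per day, one per 15-minute interval.
CREATE MATERIALIZED VIEW quarter_hour_totals
WITH (timescaledb.continuous) AS
SELECT time_bucket(INTERVAL '15 minutes', data_date) AS bucket,
       sum(value) AS total
FROM measurements
GROUP BY bucket
WITH NO DATA;

-- With option 2, a daily figure still has to be assembled
-- from up to 96 small buckets at query time.
SELECT time_bucket(INTERVAL '1 day', bucket) AS day,
       sum(total) AS total
FROM quarter_hour_totals
GROUP BY day;
```

The last query is the step I would like to avoid repeating in full every 15 minutes.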
Is there a better way to aggregate a full day's data every 15 minutes, so that each run is incremental rather than a full re-aggregation of all the data?
That is my question. I look forward to your reply.