Description
We're about to roll out LogsDB for all integrations. LogsDB uses synthetic _source. The result is that _source may differ from the original one in several ways. For example, the ordering of arrays is not preserved and values in an array are de-duplicated (internally arrays are stored in a sorted set).
I'd like to propose that ECS defines for which fields the ordering is important, so that store_array_source should be enabled for them. This comes with a storage overhead but allows us to return the original values.
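To make the effect concrete, here is a minimal Python sketch of what a sorted-set round trip does to an array. It is only an approximation: the helper name synthetic_array and the sample values are made up for illustration, and Python's string ordering stands in for whatever ordering Elasticsearch applies internally for a given field type.

```python
def synthetic_array(values):
    """Approximate a synthetic _source round trip for an array:
    duplicates are dropped and the remaining values come back sorted.
    (Hypothetical helper, not an Elasticsearch API.)"""
    return sorted(set(values))


# A command line, where order and repetition carry meaning.
original_args = ["echo", "b", "a", "b"]
roundtrip_args = synthetic_array(original_args)
print(roundtrip_args)                   # ['a', 'b', 'echo']
print(roundtrip_args == original_args)  # False: the command line is no longer recoverable

# A capability set, where only membership matters.
original_caps = ["CAP_SYS_ADMIN", "CAP_CHOWN"]
roundtrip_caps = synthetic_array(original_caps)
print(set(roundtrip_caps) == set(original_caps))  # True: nothing meaningful was lost
```

In the first case the reconstructed array no longer describes the command that was run, which is the kind of field where keeping the original array is worth the storage. In the second case the information that matters survives the round trip.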
An example of a field where the ordering is important is process.args:
Lines 143 to 153 in 5376570
The ordering isn't always important. For example, I'd consider the storage tradeoff for process.thread.capabilities.permitted to not be worth it. What matters here is the set of capabilities a thread permits, not the order in which they are listed.
Lines 205 to 215 in 5376570