-
Notifications
You must be signed in to change notification settings - Fork 104
Sharing on patricia trees #4183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This patch introduces a new `union`-like function to the patricia tree (and containers) interface, `diff`. It also provides `_shared` and `_sharing` variants of both `diff` and `union` to exploit internal sharing, and a typed interface to help ensure sharing is not accidentally lost. The `diff` function is an `union`-like generalization of the existing `diff_domains`. `diff f m1 m2` is similar to `merge f m1 m2` with `f None (Some _) = None` and `f (Some v) None = Some v`. The `_shared` variants (`diff_shared`, really) are the main motivation for this patch, and have a transformative impact on performance: `diff_shared` and `union_shared` exploit physical equality to simplify `diff_shared f m m = empty` and `union_shared f m m = m` in constant time. They also ensure maximal sharing of the result with their first argument. The `_sharing` variants do not perform this optimisation but still ensure maximal sharing of the result with their first argument. They are useful to add or remove entries into a map while preserving sharing. The `Shared_set` and `Shared_map` are thin wrapper modules that only expose sharing-preserving functions to avoid accidental loss of sharing.
lines of Haskell's implementation; this would let us preserve sharing (see | ||
Haskell's [Data.IntMap] for details). *) | ||
let rec union f t0 t1 = | ||
let[@inline always] universal_merge ?shortcut ?check0 ?check1 ~only_left |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We agreed on renaming check0
and check1
to something more sensible — maybe phys_eq_left
and phys_eq_right
?
|
||
let update_many f t0 t1 = | ||
let iv = is_value_of t0 in | ||
universal_merge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following offline discussion, we should add check0
here for memory optimisation, and either document it as a guarantee or add update_many_sharing
with that guarantee.
Extends the #4073 PR.
This adds some important inlining annotations, and add a new helper function.
This removes Shared_set and map as the type is not obviously the good one.
It also add the update_and_extend function that is useful to optimize a quadratic case in dataflow