8000 Add set operations to `@immut/hash{map, set}` and `@internal/sparse_array` Summary by Asterless · Pull Request #2145 · moonbitlang/core · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Add set operations to @immut/hash{map, set} and @internal/sparse_array Summary #2145

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 28 commits into from
May 30, 2025

Conversation

Asterless
Copy link
Contributor

Related Issue

Fixes #1830: Efficient set operations for @immut/hash{map, set}

Changes

@immut/hashmap

  • union_with(f: (K, V, V) => V)
    Merges two hashmaps, resolving key conflicts with a custom function f.
  • intersection()
    Returns a new hashmap containing keys present in both input maps, with values from the first map.
  • intersection_with(f: (K, V, V) => V)
    Computes intersection, resolving overlapping keys' values with function f.
  • difference()
    Returns entries present in the first map but not in the second.

@immut/hashset

  • intersection()
    Returns a new set containing elements common to both input sets.
  • difference()
    Returns elements present in the first set but not in the second.

@internal/sparse_array

  • intersection()
    Computes index-wise intersection of two sparse arrays.
  • difference()
    Computes index-wise difference between two sparse arrays.

Motivation

These changes provide a more complete and consistent set of set operations for immutable collections, making it easier to perform common set algebra tasks and improving API parity across collection types.

Tests

Added and updated unit tests for all new and modified methods to ensure correctness and expected behavior.

Checklist

All new and existing tests pass
Code is formatted and documented where appropriate

Copy link
peter-jerry-ye-code-review bot commented May 21, 2025
Inconsistent naming pattern in method declarations

Category
Maintainability
Code Snippet
pub fn[K : Eq + Hash, V] union(self : T[K, V], other : T[K, V]) -> T[K, V]
pub fn[K : Eq + Hash, V] T::union(self : T[K, V], other : T[K, V]) -> T[K, V]
Recommendation
Consistently use T:: prefix for all methods. Update the union function to match other new methods using T:: prefix.
Reasoning
The codebase shows mixed usage of method declaration styles between old and new code. Using consistent T:: prefix improves readability and maintains uniform method declaration pattern across the codebase.

Redundant checks in hashset intersection

Category
Performance
Code Snippet
pub fn[K : Eq + Hash] T::intersection(self : T[K], other : T[K]) -> T[K] {
match (self, other) {
...
(Branch(sa1), Branch(sa2)) => {
let res = sa1.intersection(sa2, fn(m1, m2) { m1.intersection(m2) })
if res.size() == 0 { Empty } else { Branch(res) }
}
Recommendation
Remove redundant size check and directly return Branch(res), as empty sparse arrays are already handled by the Empty pattern match.
Reasoning
The size check is unnecessary since empty results are already handled by the Empty constructor in the match pattern. Removing it simplifies the code and eliminates a redundant operation.

Missing documentation for type constraints in hashmap intersection operation

Category
Correctness
Code Snippet
pub fn[K : Eq + Hash, V] T::intersection_with(
self : T[K, V],
other : T[K, V],
f : (K, V, V) -> V
) -> T[K, V]
Recommendation
Add documentation explaining the type constraints (Eq + Hash) and f function requirements, for example:
///| Combines overlapping entries using function f.
///| K must implement Eq + Hash traits.
///| f receives the key and both values, returns combined value.
Reasoning
Type constraints and function parameters need clear documentation to help users understand requirements and correct usage. This is especially important for generic operations with special constraints.

@coveralls
Copy link
Collaborator
coveralls commented May 21, 2025

Pull Request Test Coverage Report for Build 7010

Details

  • 52 of 103 (50.49%) changed or added relevant lines in 3 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.5%) to 92.068%

Changes Missing Coverage Covered Lines Changed/Added Lines %
immut/internal/sparse_array/sparse_array.mbt 15 16 93.75%
immut/hashset/HAMT.mbt 7 26 26.92%
immut/hashmap/HAMT.mbt 30 61 49.18%
Files with Coverage Reduction New Missed Lines %
bench/stats.mbt 1 85.48%
Totals Coverage Status
Change from base Build 7006: -0.5%
Covered Lines: 8798
Relevant Lines: 9556

💛 - Coveralls

@Asterless Asterless marked this pull request as ready for review May 21, 2025 21:39
@bobzhang bobzhang force-pushed the feature/20250522_HAMT branch from 44c0a79 to eb6ef82 Compare May 22, 2025 01:12
@peter-jerry-ye peter-jerry-ye requested a review from Lampese May 22, 2025 02:46
Asterless and others added 2 commits May 22, 2025 11:10
Add set operations to @immut/hash{map, set} and @internal/sparse_array Summary and re-fmt
@bobzhang bobzhang requested a review from Guest0x0 May 25, 2025 01:00
@CAIMEOX CAIMEOX merged commit ea0d2ef into moonbitlang:main May 30, 2025
12 checks passed
@FlyCloudC
Copy link
Contributor
FlyCloudC commented May 30, 2025

The issues I reported seem to still exist in the merged PR. Here's the test evidence showing the problematic behavior

///|
priv type MyInt Int derive(Eq, Show)

///|
impl Hash for MyInt with hash_combine(_, _) {
  panic()
}

///|
impl Hash for MyInt with hash(self) {
  self._
}

///|
test {
  let m1 = new().add(MyInt(0b1_00001), 1)
  let m2 = m1.add(MyInt(0b11_00001), 1)
  inspect(m1.to_array(), content="[(MyInt(33), 1)]")
  inspect(m2.to_array(), content="[(MyInt(33), 1), (MyInt(97), 1)]")
  inspect(m2.difference(m1).to_array(), content="[]") // should be [(MyInt(97), 1)]
}

Another example without MyInt:

test {
  let m1 = for i = 0, m1 = new(); i < 100; {
    continue i + 1, m1.add(i, i)
  } else {
    m1
  }
  let m2 = m1.add(200, 200)
  inspect(m2.difference(m1), content="@immut/hashmap.of([])") // should be [(200, 200)]
}

@Asterless Asterless deleted the feature/20250522_HAMT branch May 30, 2025 16:52
@Asterless
Copy link
Contributor Author

Sorry, it seems that I mistakenly associated the issue. Please ignore my operation🙂

@FlyCloudC
Copy link
Contributor
FlyCloudC commented May 31, 2025

Ah, just to be clear - I'm talking about the code review comments I left on this pull request above, not about any independent issue ticket in the repo.

image

Asterless added a commit to Asterless/core that referenced this pull request Jun 7, 2025
…arse_array` (moonbitlang#2145)

* feat: add four new functions to HAMT and their tests

* fix(union_with): fix union_with for HAMT,now it can handle branch

* feat(sparse_array): add intersection and difference methods

* fix(HAMT): fix union_with, intersection, intersection_with and difference methods, now they can handle branchs

* feat(hashset): add intersection and difference methods to hashset

* commit other files

* feat: add four new functions to HAMT and their tests

* fix(union_with): fix union_with for HAMT,now it can handle branch

* feat(sparse_array): add intersection and difference methods

* fix(HAMT): fix union_with, intersection, intersection_with and difference methods, now they can handle branchs

* feat(hashset): add intersection and difference methods to hashset

* style: change the position of some function declarations

* fix: fix formatting of the code

* feat: Update the function declarations of hash tables and sparse arrays

* refactor:update mbti

* refactor: update hashmap and hashset function signatures to include type prefix

---------

Co-authored-by: 东灯 <me@lampese.com>
Sign up for free to join 67E6 this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Efficient set operations for @immut/hash{map, set}
6 participants
0