forked from rapidsai/raft
[pull] branch-23.04 from rapidsai:branch-23.04 #24
Open: pull wants to merge 15 commits into vishalbelsare:branch-23.04 from rapidsai:branch-23.04
Conversation
The main changes are:
- Add a one-block version. It uses a single thread block for one row of a batch and is used when `len` is relatively small (<= 16384).
- Avoid writing candidates to buffers when the number of candidates is larger than the buffer length.
- Add a parameter to control whether to use a fused filter in the last pass or a standalone filter kernel. The latter is preferable when the leading bits of the inputs are almost the same.
- Early stopping: when the target bucket contains exactly `k` values, the computation can stop early.
- Polish many implementation details, such as the initialization of `counter`, the calculation of kernel launch parameters, and the scan step.
- Update tests and benchmarks to include the new implementations. New benchmarks demonstrate the advantage of the adaptive version.

Authors:
- Yong Wang (https://github.com/yong-wang)
- Corey J. Nolet (https://github.com/cjnolet)
- Tamas Bela Feher (https://github.com/tfeher)

Approvers:
- Tamas Bela Feher (https://github.com/tfeher)

URL: #1175
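To make the early-stopping idea concrete, here is a minimal CPU sketch of radix-based top-k selection. It is not the RAFT kernel: the function name `radix_select_k`, the 8-bit digit width, and the restriction to non-negative 32-bit integers are all assumptions made for illustration. Each pass builds a histogram of one digit, scans it to find the bucket holding the k-th smallest value, and stops as soon as that bucket holds exactly the number of values still needed.

```python
def radix_select_k(values, k, bits_per_pass=8, total_bits=32):
    """Return the k smallest values (in no particular order).

    Illustrative sketch only: assumes non-negative ints below 2**total_bits
    and that total_bits is a multiple of bits_per_pass.
    """
    if not 0 < k <= len(values):
        raise ValueError("k must be in 1..len(values)")
    selected = []              # values already known to belong to the result
    candidates = list(values)  # values that share every digit examined so far
    remaining = k              # how many more values must still be chosen
    mask = (1 << bits_per_pass) - 1
    # Walk the digits from most significant to least significant.
    for shift in range(total_bits - bits_per_pass, -1, -bits_per_pass):
        # Histogram of the current digit over the surviving candidates.
        hist = [0] * (mask + 1)
        for v in candidates:
            hist[(v >> shift) & mask] += 1
        # Scan the histogram to find the bucket holding the remaining-th value.
        below = 0
        for digit, count in enumerate(hist):
            if below + count >= remaining:
                break
            below += count
        # Values in lower buckets are certainly part of the result.
        selected.extend(v for v in candidates if (v >> shift) & mask < digit)
        candidates = [v for v in candidates if (v >> shift) & mask == digit]
        remaining -= below
        # Early stop: the target bucket holds exactly the values still needed.
        if len(candidates) == remaining:
            break
    selected.extend(candidates[:remaining])
    return selected


smallest = radix_select_k([5, 1, 9, 3, 7, 3], k=3)
```

In this sketch the candidate list plays the role of the buffers mentioned above; a GPU implementation would distribute the histogram and scan across threads rather than looping.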
This is a copy and modification of a user's project, but I think it will be generally useful because the same kinds of challenges are going to come up again. In this case, the user wasn't able to build/link because they weren't using `rapids-cmake` to propagate important configuration settings. I think having a skeleton project available that we build in CI and keep up to date will help new users build more applications on RAFT.

TODO:
- [x] Make building the template optional
- [x] Verify this can build in CMake and reuse already built/installed bits
- [x] Add to docs / readme and reference in README.md
- [x] Add a little example of invoking an API (maybe `pairwise_distances`?) to `main()`

Authors:
- Corey J. Nolet (https://github.com/cjnolet)
- Ben Frederickson (https://github.com/benfred)

Approvers:
- Micka (https://github.com/lowener)
- Dante Gama Dessavre (https://github.com/dantegd)
- Divye Gala (https://github.com/divyegala)
- AJ Schmidt (https://github.com/ajschmidt8)

URL: #1312
Authors:
- Corey J. Nolet (https://github.com/cjnolet)

Approvers:
- Ben Frederickson (https://github.com/benfred)

URL: #1373
This should cut compilation time for refine_d_int64_t_float.cu.o et al. from ~900 seconds to 29 seconds. The refine specializations contain >100 instances of the ivfflat_interleaved_scan kernel, even though these should be compiled separately by the ivfflat_search specializations. The call to ivf_flat_interleaved_scan is [here](https://github.com/rapidsai/raft/blob/56ac43ad93a319a61073dce1b3b937f6f13ade63/cpp/include/raft/neighbors/detail/refine.cuh#L121).

Depends on (so please merge after) PR #1307.

Authors:
- Allard Hendriksen (https://github.com/ahendriksen)
- Corey J. Nolet (https://github.com/cjnolet)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)

URL: #1360
Add the ability for a user to specify an epilogue function to run after the distance computation in the brute_force::knn call. This lets us remove faiss from cuml by updating the hdbscan reachability code (rapidsai/cuml#5293).

Authors:
- Ben Frederickson (https://github.com/benfred)
- Corey J. Nolet (https://github.com/cjnolet)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)

URL: #1371
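As a rough illustration of what a distance epilogue does, the sketch below runs a brute-force k-nearest-neighbor search in plain Python and applies a caller-supplied function to every distance before selection. The `brute_force_knn` signature, the `(query_index, point_index, distance)` callback shape, and the core distances are hypothetical and do not mirror the C++ API; the mutual-reachability formula is the standard HDBSCAN definition.

```python
import heapq
import math


def brute_force_knn(index, queries, k, epilogue=None):
    """Return, per query, the k smallest (distance, point_index) pairs.

    `epilogue(qi, xi, d)` is a hypothetical hook that may rewrite each
    Euclidean distance before the k nearest neighbors are selected.
    """
    results = []
    for qi, q in enumerate(queries):
        scored = []
        for xi, x in enumerate(index):
            d = math.dist(q, x)
            if epilogue is not None:
                d = epilogue(qi, xi, d)
            scored.append((d, xi))
        results.append(heapq.nsmallest(k, scored))
    return results


# Made-up core distances for three points on a line.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
core = [1.0, 1.0, 4.0]


def mutual_reachability(qi, xi, d):
    # HDBSCAN mutual reachability: max of both core distances and the distance.
    return max(core[qi], core[xi], d)


knn = brute_force_knn(points, points, k=2, epilogue=mutual_reachability)
```

Fusing the transformation into the search avoids materializing the raw distance matrix and post-processing it in a second step, which is the motivation the change describes for the reachability code.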
New `bench/ann` artifact for comparing (C++ APIs for) GPU-accelerated algorithms end-to-end. Working on this with @tfeher, but had to squash the original commits into a single commit.

Things left to do:
- [x] Separate `benchmarks` executables for each different algorithm
- [x] Separate build targets for `ggnn` and `hnswlib`
- [x] Revise `bench/ann` docs
- [x] Break `factory.cuh` and `benchmark.cu` / `benchmark.cpp` into individual files for each different algorithm to make it easier to plug in new algorithms.
- [x] Separate into its own conda package

closes #1211

Authors:
- Corey J. Nolet (https://github.com/cjnolet)

Approvers:
- Ray Douglass (https://github.com/raydouglass)
- Ben Frederickson (https://github.com/benfred)
- Tamas Bela Feher (https://github.com/tfeher)

URL: #1304
Closes #1377

Authors:
- Corey J. Nolet (https://github.com/cjnolet)

Approvers:
- Divye Gala (https://github.com/divyegala)

URL: #1378
Add `--time` option to `build.sh` that enables compile time logging of `nvcc`. Also, add a script `cpp/scripts/analyze_nvcc_log.py` to find the translation units that take the longest time. Output looks like:

```
$ cpp/scripts/analyze_nvcc_log.py cpp/build/nvcc_compile_log.csv
-- loading data
-- analyzing data
-- Ten longest translation units:
phase  index                                              file        cicc   cudafe++  fatbinary  gcc (compiling)  gcc (preprocessing 1)  gcc (preprocessing 4)        ptxas   total time
0         10  ions/detail/canberra_double_double_double_int.cu   42.431063  10.601856   0.020979         6.747153               3.721194               2.093567  1618.390375  1684.006186
1         11  zations/detail/canberra_float_float_float_int.cu   36.928960   9.804138   0.011537         6.796088               3.481156               1.790703  1584.262875  1643.075457
2         85  ors/specializations/refine_d_uint64_t_uint8_t.cu  602.935531  14.980877   0.529673        36.300566               6.270717               2.889723   933.622969  1597.530056
3         84  bors/specializations/refine_d_uint64_t_int8_t.cu  606.513281  16.243960   0.729282        39.981113               5.608029               3.028493   897.241469  1569.345628
4         53  stance/neighbors/ivfpq_search_int8_t_uint64_t.cu  841.049750   8.233967   1.025554        24.248578               4.069022               1.747108   631.193734  1511.567713
5         52  istance/neighbors/ivfpq_search_float_uint64_t.cu  837.241437   8.145278   1.042313        24.400606               3.433528               1.882623   627.786672  1503.932457
6         54  tance/neighbors/ivfpq_search_uint8_t_uint64_t.cu  846.706656   8.371286   1.025517        24.094691               3.432749               1.645345   618.319234  1503.595479
7         76  izations/detail/ivfpq_search_uint8_t_uint64_t.cu  698.726266   7.086368   1.050021        39.727723               3.259101               1.333935   406.509937  1157.693351
8         74  alizations/detail/ivfpq_search_float_uint64_t.cu  706.702516   6.905794   1.049731        39.923895               2.814361               2.057154   395.604000  1155.057450
9         75  lizations/detail/ivfpq_search_int8_t_uint64_t.cu  689.390281   6.483386   1.025864        39.865668               3.121696               1.297788   409.099562  1150.284245
10        83  hbors/specializations/refine_d_uint64_t_float.cu  334.705594  15.466444   0.680270        36.551977               5.405133               2.947568   715.708781  1111.465767
-- Plotting absolute compile times
-- Wrote absolute compile time plot to cpp/build/nvcc_compile_log.csv.absolute.compile_times.png
-- Plotting relative compile times
-- Wrote relative compile time plot to cpp/build/nvcc_compile_log.csv.relative.compile_times.png
```

Authors:
- Allard Hendriksen (https://github.com/ahendriksen)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)

URL: #1262
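The ranking shown in the output above amounts to summing per-phase times for each translation unit and sorting by the total. The sketch below does that with the standard library only. It is not the contents of `analyze_nvcc_log.py`: the column names `file`, `phase`, and `seconds`, the function name, and the sample rows are assumptions chosen for illustration, and the real log layout may differ.

```python
import csv
import io
from collections import defaultdict


def longest_translation_units(csv_text, top=10):
    """Sum per-phase times per file and return the `top` slowest files.

    Assumes a hypothetical CSV layout with `file`, `phase`, and `seconds`
    columns; adapt the column names to the actual compile log.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["file"]] += float(row["seconds"])
    # Sort by total time, slowest first.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Made-up sample rows, not measurements from the log above.
sample = """file,phase,seconds
a.cu,cicc,42.4
a.cu,ptxas,1618.4
b.cu,cicc,602.9
b.cu,ptxas,933.6
c.cu,cicc,1.5
"""

ranking = longest_translation_units(sample, top=2)
```

Keeping the per-phase breakdown, as the script's output does, is what reveals whether a slow file is dominated by `cicc` or by `ptxas`.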
This PR will also automatically generate an HTML report in `conda-cpp-build` CI runs under the task `Upload Additional Artifacts`.

Authors:
- Divye Gala (https://github.com/divyegala)

Approvers:
- AJ Schmidt (https://github.com/ajschmidt8)
- Corey J. Nolet (https://github.com/cjnolet)

URL: #1369
cupy just released version 12, which we do not yet support, so wheels CI must be pinned to require a lower version.

Authors:
- Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
- Sevag H (https://github.com/sevagh)

URL: #1383
Due to some typos in #1304, the benchmarks are no longer built. This PR fixes that. I think it is important that this PR goes into 23.04; otherwise we will not have benchmarks in the stable release for two months.

Authors:
- Allard Hendriksen (https://github.com/ahendriksen)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)

URL: #1389
Switches raft over to the upcoming raft_export(`COMPONET`) support. With this, raft no longer needs to maintain its own custom exporting logic.

Authors:
- Robert Maynard (https://github.com/robertmaynard)
- Corey J. Nolet (https://github.com/cjnolet)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)

URL: #1154
This PR pins `dask` and `distributed` to `2023.3.2` and `2023.3.2.1` respectively for the `23.04` release.

xref: rapidsai/cudf#13070

Authors:
- GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
- Peter Andreas Entschev (https://github.com/pentschev)
- Ray Douglass (https://github.com/raydouglass)
- Ben Frederickson (https://github.com/benfred)
- Joseph (https://github.com/jolorunyomi)

URL: #1399
This PR adds CAGRA, a graph-based method for nearest neighbor search.

Authors:
- Tamas Bela Feher (https://github.com/tfeher)
- Corey J. Nolet (https://github.com/cjnolet)
- Ben Frederickson (https://github.com/benfred)

Approvers:
- Ben Frederickson (https://github.com/benfred)
- Divye Gala (https://github.com/divyegala)

URL: #1375
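For readers new to graph-based nearest neighbor search, the sketch below shows the general idea on the CPU: build a k-nearest-neighbor graph, then answer a query by greedily walking the graph toward closer nodes while keeping a bounded list of the best nodes seen. This is a generic best-first search, not CAGRA's actual graph construction or GPU search; the function names, the `beam` parameter, and the exhaustive graph build are assumptions made for illustration.

```python
import heapq
import math


def build_knn_graph(points, degree):
    """Exhaustively link every point to its `degree` nearest neighbors."""
    graph = []
    for i, p in enumerate(points):
        nearest = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        graph.append([j for _, j in nearest[:degree]])
    return graph


def greedy_search(points, graph, query, k, start=0, beam=8):
    """Best-first walk over the graph; returns up to k (distance, node) pairs."""
    visited = {start}
    frontier = [(math.dist(query, points[start]), start)]  # min-heap
    best = []  # max-heap (negated distances) holding at most `beam` nodes
    while frontier:
        d, node = heapq.heappop(frontier)
        # Stop once the closest unexplored node cannot improve the beam.
        if len(best) >= beam and d > -best[0][0]:
            break
        heapq.heappush(best, (-d, node))
        if len(best) > beam:
            heapq.heappop(best)  # drop the current worst node
        for nb in graph[node]:
            if nb not in visited:
                visited.add(nb)
                heapq.heappush(frontier, (math.dist(query, points[nb]), nb))
    return sorted((-nd, n) for nd, n in best)[:k]


# Twenty points on a line; the query sits between nodes 12 and 13.
points = [(float(i), 0.0) for i in range(20)]
graph = build_knn_graph(points, degree=2)
neighbors = greedy_search(points, graph, query=(12.2, 0.0), k=3)
```

The search only computes distances to nodes it actually reaches, which is why graph methods can be much cheaper than brute force; the trade-off is that results are approximate when the graph or the beam is too small.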
See Commits and Changes for more details.
Created by pull[bot]
Can you help keep this open source service alive? 💖 Please sponsor : )