FPSim2 is a small, NumPy-centric Python/C++ package for running fast compound similarity searches. It performs better with high search thresholds (>=0.7) and is currently used in the ChEMBL interface.
Highlights:
- Using a fast population count algorithm from libpopcnt
- Bounds for sublinear speedups from doi:10.1021/ci600358f
- A compressed file format with optimised read speed based on PyTables and Blosc
- Fast multicore CPU and GPU similarity searches
- In-memory and on-disk search modes
- Distance matrix calculation
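To illustrate the first two highlights, here is a minimal pure-NumPy sketch of a Tanimoto search that combines population counts with popcount bounds. It is not FPSim2's API or implementation: `popcounts` and `tanimoto_search` are hypothetical helper names, the fingerprints are assumed to be packed into rows of `uint64` words, and `np.unpackbits` stands in for libpopcnt. The pruning rule used is that, for a query with `a` bits set and a threshold `t`, a fingerprint with `b` bits set can only reach a Tanimoto score of `t` if `a * t <= b <= a / t`.

```python
import numpy as np


def popcounts(fps: np.ndarray) -> np.ndarray:
    """Return the number of set bits in each row of a (n, words) uint64 array."""
    # Portable but slow stand-in for a hardware popcount (assumption: not libpopcnt).
    as_bytes = np.ascontiguousarray(fps).view(np.uint8)
    return np.unpackbits(as_bytes, axis=1).sum(axis=1, dtype=np.int64)


def tanimoto_search(query, fps, counts, threshold):
    """Return (indices, similarities) of fingerprints with Tanimoto >= threshold."""
    a = int(popcounts(query[None, :])[0])

    # Popcount bounds: only fingerprints with a*t <= b <= a/t can pass.
    lower, upper = a * threshold, a / threshold
    candidates = np.flatnonzero((counts >= lower) & (counts <= upper))

    # Tanimoto = |A & B| / (|A| + |B| - |A & B|), computed on survivors only.
    common = popcounts(fps[candidates] & query)
    union = a + counts[candidates] - common
    sims = common / np.maximum(union, 1)  # guard against all-zero fingerprints

    keep = sims >= threshold
    return candidates[keep], sims[keep]


# Toy database of one-word fingerprints (made-up values for illustration).
fps = np.array([[0b1111], [0b0111], [0b0001], [0xFFFF]], dtype=np.uint64)
counts = popcounts(fps)  # computed once, reused for every query
query = np.array([0b1111], dtype=np.uint64)

hits, sims = tanimoto_search(query, fps, counts, threshold=0.7)
print(hits, sims)
```

The higher the threshold, the narrower the `[a * t, a / t]` window and the fewer fingerprints need a full comparison, which is consistent with searches performing better at high thresholds. If the database is kept sorted by popcount, the candidates form a contiguous range rather than a scattered mask.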
From source:
- clone the repo
pip install FPSim2/
From a conda environment:
conda install -c efelix fpsim2
Documentation is available at https://chembl.github.io/FPSim2/
To try out FPSim2 interactively in your web browser, just click on the Binder badge.